This is a guest post by André who has come up with a novel approach using Direct Digital Synthesis of the chroma signal. The circuit is relatively simple and uses few components. It’s early days, but he’s already achieved promising results. Take a look at how he does it!
This is a report of my attempt at creating a simple, low component overlay system with colour, for use in remote controlled vehicles, in an attempt to change the current go to hardware for doing video overlay, namely the character only monochrome Max OSD chip.
After much research and discussion with fellow Batperson from the WekaOSD blog I set out to devise a solution for colour overlay that uses the smallest number of components possible.
In the first part of the project I set out to create a monochrome OSD using the DSPIC33CK, as it is the MCU that I’m most comfortable with, despite the buggy IDE, because of the sheer number of peripherals it has: comparators with DAC references, timers, capture and compare modules, ADC@3.5MSPS. That way I could integrate as many analog blocks as possible, avoiding noise issues and lowering part count. It is also a 100MIPS MCU, the majority of the instructions are single cycle, and interrupt latency is fixed if you obey some restrictions.
For sync detection I used an internal comparator fed by the video signal, where the reference voltage was controlled by the comparator dedicated DAC, to separate any unwanted signals the comparator input would present on the video feed. I used a solution inspired by the schematic of Dennis Frie’s DIY OSD. The diode is a standard 1N4148.
For rendering I chose an analog switch, namely the SN74LVC1G3157 SOT-23 (SPDT) single channel with typical frequency of 340 MHz, connected in such a way that in case of MCU malfunction the video feed would pass on via the normally closed input. Rendering is made by an MCU pin on the normally open input. This input receives 3.3 volts via a 330 ohms resistor to create a signal above 0.3v which gave me a nice dark grey overlay. (Voltage divider with the 75 ohm termination resistor inside a monitor).
Then came the daunting task of generating colour, and here I made some bold assumptions. Simply put, my idea was to generate a 3.579545 MHz waveform using a frequency generator, and try to synchronize this to the detected chroma burst in an interrupt. In this way I would have a chroma signal in sync with the video feed, circumventing the need for phase-frequency lock using obsolete parts or a multiple chip solution. So I set out to find a frequency generator chip that was both small and required only clock input from the MCU and some type of control SPI or I2C.
After some search on the web I decided to use a DDS chip, and following the constraints I imposed I settled on the AD9833 chip in TSSOP 10 pin format. To simplify prototyping I opted to buy a PCB module of the AD9833 and simply removed the 25Mhz crystal and fed it with a frequency generated by the MCU, so the DDS was configured to output the Chroma frequency, in square wave format to get the benefits of the 0 – 3.3v amplitude of the DDS output.
One particularity of this chip is that the best phase noise is achieved with a frequency output that is a whole divisor of the input clock, as in the first images I got the phase inconsistency was visible.
The solution was an undocumented register of the DSPIC – REFOTRIML, that allowed me to trim the frequency to about 7 times the chroma frequency, 25.056818Mhz. That solved the phase inconsistencies by using a multiple of the input clock of the DDS. That is also a slight overclock of the DDS (max 25Mhz).


The next step required me to mix the dark grey luma signal created by the resistor divider with the generated clock signal, so I used the technique often found on resistor DACs, feeding the clock output via another resistor, effectively creating a crude DAC.
In my code I opted to synchronize with the incoming video signal in 3 phases:
first phase was to detect the video sync tip falling edge. That is accomplished with the comparator interrupt. This would provide a reference for the start of the overlay signal, by enabling a timer interrupt that would fire in the visible part of the scanline, and at the same time the code enables an input compare interrupt to detect the chroma burst cycles.
Second phase, the input compare interrupt fires on 4 rising edges. The first of these is the sync rising edge and the next 3 are the rising edges of the chroma burst. Right at the start of the interrupt the code resets and restarts the DDS at a defined phase, ideally one that would correct any frequency drift due to mismatched frequencies between the video effective chroma frequency and the local generated frequency in each scanline. Next, by reading the first buffered rising edge (the rising edge of sync tip) it will determine if the signal is a scanline, short sync or long sync, and keep track of the current scanline number. Each of the options performs different actions. If it is in a scanline it hands off to the rendering interrupt, otherwise it resets the line counter and disables the render timer and render interrupt.
Third phase, the rendering timer interrupt, basically renders according to the line number. To transition to the different colours in the colour bars, it pauses the clock feeding the DDS for 10ns (ideally; in actuality it takes more like 20ns). That equates to 14 shades of colour, theoretically it should be 28. At the end of this phase before returning from the interrupt, the timer and timer interrupt are disabled, and the comparator interrupt is re-enabled.
At present I still think it is still a complicated scheme of interrupts that needs further simplification. However although the results leave room for improvement, they are very encouraging.

Each column represents a 10ns delay in code, but in reality enabling and disabling the clock input of the DDS takes 2 cycles as it is visible that the colour cycles in approximately 14 cycles (each chroma cycle takes ~28 cycles) and there are contiguous columns that are the same colour despite the phase shift. As can be seen, there is phase shift vertically in each column, in the form of a colour shift rolling up along the columns.
Further investigation on jitter leads me to suspect the random nature of the colour shifts, but further investigation will be carried out, I have not scoped the signal completely. In the next figure, for each column the DDS input clock was stopped and a line rendered, depending on whether the output DDS clock was on a high level (white) or a low level (black). In theory the columns should alternate between white and black, and the lines should also alternate regularly between white and black due to the 180 degree phase shift between lines (in NTSC, each line should contain exactly 227.5 chroma subcarrier cycles). However as can be seen this is not the case. I think that because of the software-only line rendering routine some jitter may come from IF…THEN, and other C code instructions. Even utilizing some ASM did not mitigate the jitter.
In the next weeks or months, if time permits I will restructure the code and try to solve other issues, and attempt to minimize the rendering interrupt jitter using DMA instead of bit-banging. I still haven’t ruled out EMI on the comparator input as a source of jitter, but the comparator has a Software Hysteresis setting of none, 15, 30 or 45 mV and a digital filter with prescaler. Adjusting both did not improve the jitter.
Some solutions I thought about is putting an XOR gate on the DDS input frequency line to do proper 10ns delays, and supporting greyscale by putting another XOR gate between the DDS and the resistor DAC. Another thought is to clock the MCU at 3x27Mhz, as modern video monitors sample the video signal at 27Mhz.
Final thoughts: The current approach of generating colour shifts by delaying the DDS input clock 10ns (in practice 20ns) is feasible as a proof of concept but it would require code to keep track of the delays if arbitrary colours are to be generated. However I want a complete rendering interrupt that will simply fire the DMA or DMA’s and return to main, so that other tasks can be performed at the same time like reading a UART and creating a rendering buffer, perhaps a double buffer.
My regards.
André
Aka “Milstan”
Thanks to Batperson for the ideas shared, that gave me inspiration to create this prototype.
Thanks for Sónia for all the support and putting up with my nerdism.
Thanks to my family for everything.