A new approach using DDS

This is a guest post by André, who has come up with a novel approach using Direct Digital Synthesis (DDS) of the chroma signal. The circuit is relatively simple and uses few components. It’s early days, but he’s already achieved promising results. Take a look at how he does it!

DDS prototype

This is a report on my attempt to create a simple colour overlay system with a low component count, for use in remote controlled vehicles. The aim is to offer an alternative to the current go-to hardware for video overlay, namely the character-only, monochrome Max OSD chip.

After much research and discussion with Batperson from the WekaOSD blog, I set out to devise a solution for colour overlay that uses the smallest number of components possible.

In the first part of the project I set out to create a monochrome OSD using the dsPIC33CK. It is the MCU that I’m most comfortable with, despite the buggy IDE, and I chose it because of the sheer number of peripherals it has: comparators with DAC references, timers, capture and compare modules, and an ADC running at 3.5 MSPS. That way I could integrate as many analog blocks as possible, avoiding noise issues and lowering part count. It is also a 100 MIPS MCU, the majority of its instructions are single cycle, and interrupt latency is fixed if you obey some restrictions.

For sync detection I used an internal comparator fed by the video signal. Its reference voltage is set by the comparator’s dedicated DAC, so that the comparator responds to the sync pulses and rejects the unwanted parts of the video feed. The input circuit is inspired by the schematic of Dennis Frie’s DIY OSD. The diode is a standard 1N4148.


For rendering I chose an analog switch, namely the SN74LVC1G3157, a single-channel SPDT switch in a SOT-23 package with a typical bandwidth of 340 MHz. It is connected in such a way that in case of MCU malfunction the video feed passes through via the normally closed input. Rendering is done by an MCU pin on the normally open input. This input receives 3.3 V via a 330 ohm resistor, which forms a voltage divider with the 75 ohm termination resistor inside the monitor and creates a signal above 0.3 V. That gave me a nice dark grey overlay.
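As a rough check of the overlay level, here is a minimal Python sketch of the divider. It assumes the MCU pin drives a full 3.3 V, that the monitor's 75 ohm termination is the only load, and that the on-resistance of the analog switch can be ignored. The function name is made up for illustration.

```python
# Overlay luma level: MCU pin -> 330 ohm series resistor -> 75 ohm termination.
# Assumptions: ideal 3.3 V drive, no other load, switch on-resistance ignored.

def divider_voltage(v_in, r_series, r_load):
    """Voltage across r_load in a simple two-resistor divider."""
    return v_in * r_load / (r_series + r_load)

v_overlay = divider_voltage(3.3, 330.0, 75.0)
print(f"Overlay level at the monitor: {v_overlay:.3f} V")
```

Under these assumptions the level comes out at roughly 0.6 V, which is consistent with a signal above 0.3 V.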

Then came the daunting task of generating colour, and here I made some bold assumptions. Simply put, my idea was to generate a 3.579545 MHz waveform using a frequency generator, and to synchronize it to the detected chroma burst in an interrupt. In this way I would have a chroma signal in sync with the video feed, circumventing the need for a phase-frequency lock using obsolete parts or a multiple chip solution. So I set out to find a frequency generator chip that was small and required only a clock input from the MCU and some type of control interface, SPI or I2C.

After some searching on the web I decided to use a DDS chip, and following the constraints I had imposed I settled on the AD9833, which comes in a 10-lead MSOP package. To simplify prototyping I bought a PCB module of the AD9833, removed the 25 MHz crystal and fed it with a clock generated by the MCU. The DDS was configured to output the chroma frequency as a square wave, to get the benefit of the full 0 to 3.3 V amplitude of the DDS output.


One peculiarity of this chip is that the best phase noise is achieved when the output frequency is a whole divisor of the input clock. In the first images I got, the phase inconsistency was visible.

The solution was an undocumented register of the dsPIC, REFOTRIML, which allowed me to trim the frequency to about 7 times the chroma frequency, 25.056818 MHz. Making the DDS input clock a whole multiple of the output frequency solved the phase inconsistencies. It is also a slight overclock of the DDS (max 25 MHz).
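To illustrate why the clock ratio matters, here is a small Python sketch. It uses the AD9833 relation f_out = FREQREG x f_MCLK / 2^28 and compares the original 25 MHz clock with the trimmed one. The function names are made up, and the reasoning that square wave edges can only land on master clock edges is my assumption about how the effect arises, not a measurement.

```python
F_CHROMA = 315e6 / 88      # NTSC colour subcarrier, about 3.579545 MHz
PHASE_BITS = 28            # width of the AD9833 phase accumulator

def tuning_word(f_out, f_mclk):
    """Nearest integer frequency register value for the requested output."""
    return round(f_out * 2**PHASE_BITS / f_mclk)

def actual_output(word, f_mclk):
    """Output frequency actually produced by an integer tuning word."""
    return word * f_mclk / 2**PHASE_BITS

def clocks_per_cycle(f_out, f_mclk):
    """Number of master clock periods in one output cycle."""
    return f_mclk / f_out

for f_mclk in (25e6, 7 * F_CHROMA):
    word = tuning_word(F_CHROMA, f_mclk)
    error_hz = actual_output(word, f_mclk) - F_CHROMA
    print(f"MCLK {f_mclk / 1e6:.6f} MHz: "
          f"{clocks_per_cycle(F_CHROMA, f_mclk):.4f} clocks per chroma cycle, "
          f"FREQREG = {word}, frequency error = {error_hz:+.4f} Hz")
```

With 25 MHz there are just under 7 clocks per chroma cycle, so an edge has to slip by one whole clock period every so often. With the trimmed clock the ratio is exactly 7 and the edge pattern can repeat identically on every cycle. A tiny residual frequency error remains in both cases, because the tuning word has to be an integer.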

DDS phase inconsistencies
DDS phase consistency

The next step was to mix the dark grey luma signal created by the resistor divider with the generated clock signal. I used the technique often found in resistor DACs, feeding in the clock output via another resistor, effectively creating a crude DAC.


In my code I opted to synchronize with the incoming video signal in 3 phases:

First phase, detecting the falling edge of the video sync tip. That is accomplished with the comparator interrupt. This provides a reference for the start of the overlay signal, by enabling a timer interrupt that fires in the visible part of the scanline. At the same time the code enables an input capture interrupt to detect the chroma burst cycles.

Second phase, the input capture interrupt fires once 4 rising edges have been captured. The first of these is the sync rising edge and the next 3 are rising edges of the chroma burst. Right at the start of the interrupt the code resets and restarts the DDS at a defined phase, ideally one that corrects, on each scanline, any drift caused by the mismatch between the effective chroma frequency of the video and the locally generated frequency. Next, by reading the first buffered rising edge (the rising edge of the sync tip), it determines whether the signal is a scanline, a short sync or a long sync, and keeps track of the current scanline number. Each of the options performs different actions. If it is a scanline it hands off to the rendering interrupt, otherwise it resets the line counter and disables the render timer and render interrupt.

Third phase, the rendering timer interrupt renders according to the line number. To move between the different colours in the colour bars, it pauses the clock feeding the DDS for 10 ns (ideally; in practice it takes more like 20 ns). That equates to 14 shades of colour, where theoretically it should be 28. At the end of this phase, before returning from the interrupt, the timer and timer interrupt are disabled and the comparator interrupt is re-enabled.
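The sync classification in the second phase can be modelled in a few lines. This is only a Python sketch of the logic, not the dsPIC code. The pulse widths are nominal NTSC values that I am assuming here, the thresholds are simple midpoints, and all names are hypothetical.

```python
# Nominal NTSC sync pulse widths in microseconds (assumed; tune on real hardware).
H_SYNC_US = 4.7        # normal scanline sync
SHORT_SYNC_US = 2.3    # equalizing pulse
LONG_SYNC_US = 27.1    # broad pulse during vertical sync

def classify_sync(width_us):
    """Classify a sync pulse by its low time, using midpoint thresholds."""
    if width_us < (SHORT_SYNC_US + H_SYNC_US) / 2:
        return "short"
    if width_us < (H_SYNC_US + LONG_SYNC_US) / 2:
        return "scanline"
    return "long"

def track_lines(pulse_widths_us):
    """Line counter after each pulse; any non-scanline pulse resets it."""
    line = 0
    history = []
    for width in pulse_widths_us:
        if classify_sync(width) == "scanline":
            line += 1      # a normal line: count it and allow rendering
        else:
            line = 0       # vertical interval: reset the counter
        history.append(line)
    return history

print(track_lines([2.3, 4.7, 4.7, 27.1, 4.7]))
```

The width would come from the time between the comparator's falling edge and the first captured rising edge.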


At present I think it is still a complicated scheme of interrupts that needs further simplification. However, although the results leave room for improvement, they are very encouraging.

NTSC Colour bars
14 hues of colour are being generated, overlaid onto a video signal. Some phase noise is present in the form of rolling vertical colour changes. However colour is consistent horizontally across each scan line.

Each column represents a 10 ns delay in code, but in reality enabling and disabling the clock input of the DDS takes 2 instruction cycles. This is visible because the colours repeat after approximately 14 steps (each chroma cycle lasts ~28 instruction cycles of 10 ns), and because there are contiguous columns that are the same colour despite the phase shift. As can be seen, there is also a vertical phase shift within each column, in the form of a colour shift rolling up along the columns.
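The 28 versus 14 figures follow from the subcarrier period. Here is a short Python sketch of the arithmetic, with function names that are mine and assuming a pause simply delays the subcarrier by the same amount of time.

```python
F_CHROMA = 315e6 / 88                 # NTSC colour subcarrier in Hz
CHROMA_PERIOD_NS = 1e9 / F_CHROMA     # about 279.4 ns

def hue_steps(pause_ns):
    """Number of distinct phase steps that fit in one subcarrier cycle."""
    return round(CHROMA_PERIOD_NS / pause_ns)

def phase_shift_deg(pause_ns):
    """Phase delay of the subcarrier caused by pausing the DDS clock."""
    return 360.0 * pause_ns / CHROMA_PERIOD_NS

for pause in (10, 20):
    print(f"{pause} ns pause: {phase_shift_deg(pause):.1f} degrees per step, "
          f"{hue_steps(pause)} hues")
```

A 10 ns pause gives steps of about 13 degrees, a 20 ns pause about 26 degrees, so the hue resolution halves.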

I suspect jitter is behind the random nature of the colour shifts, but more investigation is needed, as I have not yet scoped the signal completely. In the next figure, for each column the DDS input clock was stopped and a line rendered, white or black depending on whether the DDS output was at a high level (white) or a low level (black). In theory the columns should alternate between white and black, and the lines should also alternate regularly between white and black due to the 180 degree phase shift between lines (in NTSC, each line should contain exactly 227.5 chroma subcarrier cycles). However, as can be seen, this is not the case. I think that because the line rendering routine is software-only, some jitter may come from IF...THEN and other C code constructs. Even using some ASM did not mitigate the jitter.
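The expected line-to-line alternation can be checked with a couple of lines of Python. This is only the ideal case, with a hypothetical function name, assuming exactly 227.5 subcarrier cycles per line and no jitter.

```python
CYCLES_PER_LINE = 227.5   # NTSC subcarrier cycles in one scanline

def burst_phase_deg(line_number):
    """Subcarrier phase at the start of a line, relative to line 0."""
    return (line_number * CYCLES_PER_LINE * 360.0) % 360.0

phases = [burst_phase_deg(n) for n in range(4)]
print(phases)
```

The half cycle left over on each line is what flips the phase by 180 degrees, so a free-running subcarrier sampled at the same point on successive lines should alternate between high and low.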


In the coming weeks or months, if time permits, I will restructure the code and try to solve other issues, and attempt to minimize the rendering interrupt jitter using DMA instead of bit-banging. I still haven’t ruled out EMI on the comparator input as a source of jitter. The comparator has a software hysteresis setting of none, 15, 30 or 45 mV and a digital filter with prescaler, but adjusting both did not improve the jitter.

Some solutions I have thought about are putting an XOR gate on the DDS input frequency line to do proper 10 ns delays, and supporting greyscale by putting another XOR gate between the DDS and the resistor DAC. Another thought is to clock the MCU at 3 x 27 MHz, as modern video monitors sample the video signal at 27 MHz.

Final thoughts: the current approach of generating colour shifts by delaying the DDS input clock by 10 ns (in practice 20 ns) is feasible as a proof of concept, but it would require code to keep track of the delays if arbitrary colours are to be generated. What I want instead is a complete rendering interrupt that simply fires one or more DMA transfers and returns to main, so that other tasks can be performed at the same time, like reading a UART and filling a rendering buffer, perhaps a double buffer.

My regards.
Aka “Milstan”

Thanks to Batperson for the ideas shared, that gave me inspiration to create this prototype.
Thanks to Sónia for all the support and for putting up with my nerdism.
Thanks to my family for everything.



Here is the schematic for the colour overlay prototype I have been breadboarding. The next logical step would be to design a proper circuit board. However I’m not happy with it for a number of reasons:

  • Too many ICs, too complex
  • Not able to support both video standards at the same time (need to select either a PAL or NTSC crystal)
  • Doesn’t work reliably with PAL in any case, because the PAL colourburst changes phase on each line and there is a 50/50 chance that the AD724 will change phase in the opposite direction to the incoming signal
  • The MC44144 is an obsolete part

I want to explore other alternatives. I have a few ideas. However for the time being I’m taking a break from analog video. My son needs an alarm clock so I’m going to build him one!

Updated Timing Board

Currently I’m waiting on parts, so there is not much for me to do project-wise at the moment. In the meantime I’ve fixed the errors on the video timing board. The MC44144 and LM1881 are now separately AC-coupled to the CVBS input, and I’ve removed the clamp circuit. I also replaced the too-small 1.27mm pinheaders with normal-sized 2.54mm DuPont headers.


The Switch is Early

There is a delay of several hundred nanoseconds between the RGB inputs and the output of the AD724. I estimate 250-300 ns, and it is probably a multiple of the 4FSC period. But the pixel switch activates almost immediately, before the overlay pixels are rendered. This is visible as a black bar to the left of the patterns below. The left pattern should be white-black stripes, and the right pattern should be white stripes only.

Notice the black bar to the left of the white bar. This is because the pixel switch is faster than the pixels.
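If the delay really is a whole number of 4FSC periods, the estimate narrows down quickly. The Python sketch below assumes an NTSC subcarrier and uses a made-up helper name; it lists the multiples that fall inside the estimated range.

```python
F_SC = 315e6 / 88                  # NTSC subcarrier in Hz (assumed standard)
PERIOD_4FSC_NS = 1e9 / (4 * F_SC)  # about 69.8 ns

def candidate_delays(low_ns, high_ns):
    """Whole multiples of the 4FSC period that fall inside [low_ns, high_ns]."""
    found = []
    n = 1
    while n * PERIOD_4FSC_NS <= high_ns:
        if n * PERIOD_4FSC_NS >= low_ns:
            found.append((n, n * PERIOD_4FSC_NS))
        n += 1
    return found

for n, delay_ns in candidate_delays(250, 300):
    print(f"{n} x 4FSC periods = {delay_ns:.1f} ns")
```

Only one multiple lands in the 250-300 ns window, which would put the delay at roughly 279 ns, or one full subcarrier cycle. This is a guess from the arithmetic, not something measured.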

I tried to eliminate this by putting an RC delay in front of the pixel switch. It does delay the switch, but the result is disappointing.

Delaying the switch with an RC network of 47pF and 4K. The delay is gone but there is a blue “halo” in its place, and the switch cannot toggle faster than the delay.

The worst problem is that the switch can’t toggle faster than the delay, which means overlays 1 or 2 pixels wide don’t even appear. I’m not sure what is causing the blue halo. It could be because the envelope of the switch signal no longer has sharp square edges. Compare these 2 scope traces:

Note the sloping attack and decay. The last 2 peaks didn’t reach the threshold to activate the switch.

I need to find a way to delay the rise and fall of the switch signal equally, while preserving the sharp edges, even when the switch toggle time for 1 pixel (162ns) is less than the delay (250+ns).
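A quick model of the RC network shows why narrow pulses vanish. This Python sketch treats it as an ideal first-order low-pass starting from fully discharged, and assumes the switch control input trips at 70% of the supply. That threshold is a guess at a typical CMOS input level, not a measured figure, and the function names are hypothetical.

```python
import math

def rc_time_to_threshold_ns(r_ohms, c_farads, threshold_fraction):
    """Time for an RC step response to reach a fraction of its final value."""
    tau_ns = r_ohms * c_farads * 1e9
    return -tau_ns * math.log(1.0 - threshold_fraction)

def level_after_ns(r_ohms, c_farads, t_ns):
    """Fraction of the final value reached t_ns after a step."""
    tau_ns = r_ohms * c_farads * 1e9
    return 1.0 - math.exp(-t_ns / tau_ns)

R_OHMS, C_FARADS = 4e3, 47e-12   # the 4K / 47pF network from the experiment
PIXEL_NS = 162.5                 # one pixel at 320 pixels per 52 us line
THRESHOLD = 0.7                  # assumed switching threshold (fraction of supply)

delay_ns = rc_time_to_threshold_ns(R_OHMS, C_FARADS, THRESHOLD)
reached = level_after_ns(R_OHMS, C_FARADS, PIXEL_NS)
print(f"Delay to threshold: {delay_ns:.0f} ns")
print(f"Level reached by a 1-pixel pulse: {reached:.0%} of supply")
```

Under these assumptions the delay is of the right order, but a single-pixel pulse only charges the capacitor to a little under 60% before it ends, so it never crosses the threshold. That matches the peaks in the scope trace that fail to activate the switch.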

The good folk at Stack Overflow have made some suggestions:

  1. Use a hex Schmitt trigger with up to 6 small RC networks, in series. One can be tunable with a variable cap.
  2. Simply use a clock delay IC such as the DS1110.
  3. Use one or more flip flops, clocked from the MC44144.

To which I added an idea of my own:

  4. Delay the signal in the microcontroller. This could be done by setting up a second DMA transfer to a different GPIO port. On that port, only enable 1 pin, which will be the switch. Drive the DMA from a timer with the same period as the pixel clock, but started an arbitrary number of ticks later.

Of these, I like 2 and 3 for simplicity. 3 has the advantage that the switching will be in the same clock domain as the pixels. This may turn out to be important, otherwise the switching may still be in and out of sync with the overlay image.

My own idea would be a great one (if I do say so myself), if not for the fact that I would then have 2 DMA transfers contending with the CPU for the bus. From my reading, there is an issue with the DMA2 controller on the STM32F4 series when concurrently accessing peripherals. So adding an extra stream willy nilly is something to be avoided. The finished product will need at least 1 other DMA transfer (to receive data from a UART), or most likely more if I end up integrating it with a flight controller.

I think I will try the flip-flop approach, and it will be an added incentive to clock the microcontroller from the pixel clock, if this is possible on a Discovery board. I will also experiment with 2 DMA transfers.

Edit: André in Portugal has suggested another way: use a comparator to monitor the overlay video output. If it rises above black level, the comparator activates the switch. Thanks André! This would be a bit like the “blue screen” chroma keying used to show the weatherman in front of a computer-generated weather map back in the old days, except this would be luma keying. It would free up an extra bit (along with the other spare bit I’m not currently using) allowing for more colours. But it would mean sacrificing the ability to draw black in the overlay. Everything is a tradeoff...


Overlay, Take 2

Over the weekend, I tracked down the problem with poor saturation. This was due to an incorrect connection to ground in my clamp circuit. After fixing this I had the opposite problem: the strong overlay signal was causing the monitor’s AGC to dim the rest of the picture. I compensated for this by increasing the series resistance after the AD724 from 75 ohms to 150 ohms. This didn’t alter the brightness of the overlay but the source video is now less dim.

I also used an RC circuit to shift the phase of the subcarrier clock, in an attempt to correct the colours. I experimented with different capacitor values and discovered that 1nF and above caused the clock to disappear completely. 20pF caused a barely perceptible shift in colour, but a 10K resistor with a 470pF capacitor gave correct red, green and blue colour bars. Subjectively they appear exactly as they should, however I will experiment further to see if there is any more scope for improvement. The colours now appear solid where before there was banding. I’m not sure why the banding has disappeared, unless it was clock jitter that the RC network somehow mitigated.

Here is the result.

I have to say it’s looking a lot better than last week. The only issue still remaining is a switch artefact. Notice that there is a black vertical line before the red bar appears. This shows on the scope as a small bright spot just above sync level. Since there is no artefact when switching from overlay back to source video, I suspect it is the switch responding more quickly than the AD724, switching in a blank image before the colour signal has been generated. If so, I should be able to mitigate it by introducing a delay to the switch.


Well, sort of.

It’s a colour test pattern overlay, but there are a bunch of issues:

  • The colours are wrong. I had to change the order of the red, green and blue inputs to get even this result, because the phase delay from the MC44144 is 60 degrees. The colours are also poorly saturated, and look worse in real life than in the picture.
  • The image flickers, due to noise. This is most likely due to the use of a breadboard.
  • The source video appears washed out. I believe this may be partly the result of clamping which seems to compress the waveform by 200mV.

On the positive side, I did solve some problems:

  • I’m using an SN74LVC2G53 analog switch. This one is unbuffered. Since the AD724 and the camera are both outputting at full-drive 2V p/p, there should be no need for buffering. The switch is very fast (<10ns) and seems to be performing as advertised.
  • Originally the 2 video signals were mismatched by around 200mV. This was enough to cause an unstable picture, as the overlay signal dropped below sync level. Worse, it even appeared to contaminate the source video in the LM1881, causing it to lose sync.

In order to continue with this approach, I will need to solve the subcarrier phase problem, and also find out why the colours look so terrible.

Another problem that has been bugging me is that the MC44144 often starts up without generating a subcarrier. It seems worse when using the USB power supply which is very noisy, although the filter on the board reduces it to < 5mV p/p. Both the chip and the crystal came from AliExpress, could it be a quality issue?


After resolving some DMA issues I now have a working testbed running on the STM32F413 MCU. It is using the signals from the timing board to drive an AD724. It is generating a good test pattern but connections are very finicky, since I have the AD724 on a breadboard. Noise is visible and jiggling the wires is sometimes necessary to get it to work. But it will do for the time being.

In theory the output from this should be in phase with the video from my Runcam. To see if that is the case or not I need to have a working pixel switch. When I attempted to use the MAX4313 I didn’t see a picture, and when I checked the output with the scope I saw that the sync tips were being clipped off. It is attenuating the waveform when it drops below 0 volts. I will try clamping the sync tips to 1 volt to see whether it will pass the entire waveform through.

Video on loss of sync

I would like WekaOSD to have the capability to continue generating video if the signal from the camera is lost or a camera is not connected. For this, I will need to detect when loss of sync (LOS) occurs and have the MCU generate sync signals.

I have worked out a way I can do this. The sync input on the AD724 is driven by the CSYNC output of the LM1881. At the same time, CSYNC is connected to a pin on the MCU. During normal operation the MCU pin is configured as an input so the MCU can derive horizontal timing. If LOS occurs or there is no camera, the pin can be configured as an output. Since the LM1881 sync output appears to be open-drain, either the LM1881 or the MCU can pull the line low to generate a sync pulse. This means the MCU can easily generate its own sync signals with no risk of 2 pins driving the line at the same time.

I can detect LOS by monitoring the Burst Gate output from the LM1881 using a second MCU pin. When there is incoming video there will be a short pulse every 64 uS. If those pulses stop then there is no video, and the MCU can switch to generating sync pulses.
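The detection rule can be sketched as a simple timeout on the Burst Gate pulses. The Python below is a model of the logic only, with hypothetical names. The choice of 4 missed lines as the threshold is my assumption, and on the MCU this would more likely be a timer that each pulse resets.

```python
LINE_PERIOD_US = 64.0          # expected spacing of Burst Gate pulses
MISSED_LINES_FOR_LOS = 4       # hypothetical: tolerate a few missing pulses

def first_loss_of_sync(pulse_times_us, now_us):
    """Return the time at which LOS would be declared, or None if video is present.

    Power-up (time 0) counts as the last pulse, so a missing camera is also caught.
    """
    timeout = LINE_PERIOD_US * MISSED_LINES_FOR_LOS
    previous = 0.0
    for t in list(pulse_times_us) + [now_us]:
        if t - previous > timeout:
            return previous + timeout   # the watchdog would have expired here
        previous = t
    return None

pulses = [LINE_PERIOD_US * i for i in range(1, 11)]   # 10 lines of video, then nothing
print(first_loss_of_sync(pulses, 700.0))    # still inside the timeout window
print(first_loss_of_sync(pulses, 1000.0))   # pulses stopped long enough ago
print(first_loss_of_sync([], 300.0))        # no camera connected at all
```

Once LOS is declared, the MCU would reconfigure the CSYNC pin as an output and start generating its own sync pulses.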

If I make the pixel switch pin open-drain with a pullup resistor, then when loss of sync occurs I can put the pin into input mode. This will have the effect of turning the pixel switch signal continuously on, regardless of the value for the current pixel.

New Boards Complete

It took a long time to get these boards made up and populated, and I ran into issues along the way due to some errors in the design. The first mistake was to use 1.27mm pinheader footprints instead of 2.54mm. This might have been great from a miniaturisation point of view but not for convenience of prototyping, as I needed to make custom jumpers to connect the boards, and special adapters to work with the DuPont connectors that are normal for prototyping. More troublesome was the use of a single AC coupling capacitor on the video input. Each IC which receives the video signal needs its own AC coupling capacitor, as they apply their own bias to the incoming signal and will interfere with each other. I had to perform surgery by scraping off the soldermask, cutting traces and soldering in extra caps before the LM1881 sync separator would extract meaningful sync signals. Once I get some free time I will correct the design files, but for now please don’t anybody use them! Update: the design files have since been fixed.

I have a PAL and NTSC version of the timing board. I intend to start with the NTSC version and use it to drive an AD724 video generator in sync with the colour subcarrier. I will then use another board (not shown) with a video switch controlled by the MCU to switch pixels.

Another possibility is to ditch the AD724 and generate composite video with discrete components, by applying a delay to the colour subcarrier. This is how the early home computers did it back in the 1980s. If time permits I will explore both options, but for now the aim is to get a proof of concept working. The next task on the agenda is to revisit the WekaOSD code and adapt it for the new hardware.


New Board Designs

The new design is a timing board, which extracts all the signals needed for genlock but does not generate any video. There will also be a fast video switch on a separate board using a MAX4313, which is both a buffer and a switch. Switching time is around 40ns, much slower than the TI switch I was planning to use. But this may still be OK, and it does not require a negative supply. I will still make the negative supply board though, and if the MAX4313 is not suitable I can make another board with the TI switch.

This is a modular approach. I intend to try generating the CVBS from the uC, but if this does not work out I can make another board with an AD724 as originally planned.

The circuit schematics are complete and I am currently working on the routing.

Assuming 52us of active video per line, a 320*240 resolution means each pixel lasts 162.5ns. This means the MAX4313, which takes 40ns to switch, will spend around 1/4 of a pixel switching.
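The pixel budget is easy to recompute for other resolutions. Here is a small Python sketch with function names of my own choosing.

```python
def pixel_time_ns(active_line_us, pixels_per_line):
    """Duration of one pixel given the active line time and horizontal resolution."""
    return active_line_us * 1000.0 / pixels_per_line

def switch_fraction(switch_ns, pixel_ns):
    """Fraction of a pixel consumed by the switch transition."""
    return switch_ns / pixel_ns

pixel_ns = pixel_time_ns(52.0, 320)
fraction = switch_fraction(40.0, pixel_ns)
print(f"Pixel time: {pixel_ns:.1f} ns, switching takes {fraction:.0%} of a pixel")
```

Doubling the horizontal resolution would halve the pixel time, at which point a 40ns switch would eat about half of every pixel.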

Video timing board design is uploaded to Github: