Flexible Solution for Broadband OFDM PHYs
Richard Thomas, Senior Design Engineer, Lattice Semiconductor UK Limited
Lattice Semiconductor has developed an FPGA implementation of an OFDM transceiver design based on the core physical layer (PHY) requirements of the “WiMAX” 802.16-2004 OFDM specification. The use of a number of IP cores for already well-defined DSP functions helped to reduce the development time from years to months.
Orthogonal Frequency Division Multiplexing (OFDM) transceivers are widely used in wireless applications including ETSI DVB-T/H digital terrestrial television transmission and IEEE network standards such as 802.11 (“WiFi”), 802.16 (“WiMAX”) and 802.20 (proposed PHY). Such transceivers have large arithmetic processing requirements, which can become prohibitive if implemented in software on a DSP processor. However, the highly pipelined nature of much of the processing lends itself well to a hardware implementation. A flexible solution with low NRE, such as an FPGA implementation, allows late changes to meet evolving standards. FPGAs also offer a well-understood design flow, starting from HDLs via synthesis followed by place and route.
The OFDM Transceiver in a System

The diagram illustrates the scope of this OFDM base station PHY transceiver FPGA implementation within a complete modem, where fs = OFDM sample rate at the FFT, which is up to 11.424 MHz for the maximum supported nominal channel bandwidth of 10 MHz.
For this implementation, a single ADC (real-only) low-IF receiver input was chosen to avoid quadrature gain and phase mismatches associated with two (quadrature) ADC inputs. For the transmitter, quadrature DAC outputs to drive a direct conversion (or “zero-IF”) radio were implemented, although a low-IF single DAC output could be implemented with the inclusion of an additional mixer and fixed rate interpolating filter within the FPGA design.
A common control and data interface to the MAC layer processing was implemented using a "Wishbone" bus (an open source SoC bus standard) but users could easily adapt the design to a different interface.
Design Overview
The complete base station physical layer (PHY) transceiver was implemented in a single Lattice ECP33 FPGA and features:
• Duplexing support for full duplex FDD, half-duplex FDD or TDD modes
• Nominal channel bandwidth up to 10 MHz
• Forward Error Correction (FEC): mandatory 802.16-2004 requirement of Reed-Solomon and Convolutional coding
• Modulation and coding: support for the highest 802.16-2004 data rate 64-QAM-3/4 scheme
• Transmitter has 16-bit Zero-IF complex DAC output, with sampling rate up to 11.424 MHz and 802.16 “long” and “short” preamble generation
• Receiver has 10-bit Low-IF ADC input, with sampling rate up to 22.848 MHz, burst detection and frequency / timing synchronization with carrier frequency and sample timing recovery well within 802.16-2004 tolerances
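To put the 64-QAM-3/4 scheme and the 11.424 MHz sample rate in context, the following is a rough sketch of the raw payload rate per OFDM symbol. The FFT size of 256, the 192 data subcarriers and the guard (cyclic prefix) fractions are assumptions taken from the 802.16-2004 OFDM PHY rather than figures stated in this article, and the result ignores preambles, framing and MAC overhead.

```python
from fractions import Fraction

FS_HZ = 11_424_000          # sample rate from the text (10 MHz nominal channel)
N_FFT = 256                 # ASSUMPTION: FFT size of the 802.16-2004 OFDM PHY
N_DATA = 192                # ASSUMPTION: data subcarriers per OFDM symbol
BITS_PER_SUBCARRIER = 6     # 64-QAM carries 6 coded bits per subcarrier
CODE_RATE = Fraction(3, 4)  # overall coding rate of the 64-QAM-3/4 scheme


def info_bits_per_symbol():
    """Information bits carried by one OFDM symbol."""
    return int(N_DATA * BITS_PER_SUBCARRIER * CODE_RATE)


def peak_rate_bps(guard_fraction):
    """Raw payload rate in bit/s for a given guard interval fraction."""
    samples_per_symbol = N_FFT * (1 + Fraction(guard_fraction))
    symbol_rate = Fraction(FS_HZ) / samples_per_symbol
    return float(info_bits_per_symbol() * symbol_rate)


if __name__ == "__main__":
    for guard in (Fraction(1, 4), Fraction(1, 32)):
        print(f"guard {guard}: {peak_rate_bps(guard) / 1e6:.2f} Mbit/s")
```

Under these assumptions the rate lies between roughly 31 and 37 Mbit/s depending on the guard interval chosen.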
The structure of the transceiver is illustrated below with blocks containing IP cores highlighted.
OFDM Transceiver FPGA PHY Sub-system structure

The Lattice ispLEVER (Windows) software suite was used to perform all development from HDL simulation through synthesis, mapping, place and route to final FPGA PROM programming file generation. Development of commonly occurring functions was avoided by using a number of Lattice IP cores in the design, including Reed-Solomon encoder/decoder, Viterbi decoder, FFT and FIR filters. Fixed point analysis was done using a Matlab model and overall performance simulations were run over many more symbols than could be practically simulated in an HDL simulator.
Making full use of ECP FPGA features
Multiple clock domains: two of the four on-chip PLLs were used to generate and align four separate clock domains for this design. These were a sample-rate clock running at fs = 11.424 MHz for a 10 MHz nominal bandwidth, and three multiples of it: 2*fs, 8*fs and 12*fs, making the fastest clock approximately 137 MHz. A fifth, externally sourced and independent clock domain was used for the Wishbone interface clock. Suitably pipelined designs on ECP devices can run at clock speeds in excess of 200 MHz, but this was not required for this design.
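The clock arithmetic can be checked with a few lines of Python. This is only an illustrative calculation using the figures quoted above; the function and constant names are hypothetical and not part of the design.

```python
# Clock-domain frequencies derived from the OFDM sample rate (illustrative only).
FS_HZ = 11_424_000           # fs for a 10 MHz nominal channel bandwidth (from the text)
MULTIPLIERS = (1, 2, 8, 12)  # fs, 2*fs, 8*fs and 12*fs, as listed in the text


def clock_plan(fs_hz, multipliers=MULTIPLIERS):
    """Return a dict mapping each multiplier to its clock frequency in Hz."""
    return {m: m * fs_hz for m in multipliers}


if __name__ == "__main__":
    for mult, freq in clock_plan(FS_HZ).items():
        print(f"{mult:2d} x fs = {freq / 1e6:8.3f} MHz")
```

The 2*fs result matches the 22.848 MHz ADC sampling rate quoted for the receiver, and 12*fs gives 137.088 MHz for the fastest domain.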
Embedded Block RAMs (EBR): 9 kbit RAMs, with variable aspect ratios from 9k x 1 bit to 256 x 36 bit. These were used throughout the design to provide fast and efficient storage for large quantities of data, typically complete “data blocks” for each OFDM symbol at various stages in the pipeline. This allowed easy isolation of each pipeline processing block from the next, thereby simplifying block design specification.
Distributed RAMs: small 16 x 2 bit RAMs that can be used as an alternative to a single logic slice (effectively replacing 2 LUT4 elements). These were used extensively in the design for very small storage (they are much more efficient than flip-flops). In places, Distributed RAMs were ganged together to create moderate RAM sizes – such a technique allows trade-offs to be made between use of block RAMs or logic slices to optimise the design for the amount of remaining FPGA resources available.
DSP Modules: dedicated multipliers, adders and accumulator logic. These are particularly effective at performing complex multiplies using 2 multipliers in a multiply-add/sub configuration over 2 successive clock cycles. Because the adder is built into the DSP module rather than being constructed out of LUTs, multiply-add operations can be run at high speed for very high throughput.
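As a behavioural illustration of the two-multiplier complex multiply, the sketch below computes the real part in one cycle (adder in subtract mode) and the imaginary part in the next (adder in add mode). The ordering of the two cycles and the function name are assumptions made for illustration; the article does not describe the actual operand scheduling inside the DSP module.

```python
def complex_multiply_two_cycle(a_re, a_im, b_re, b_im):
    """Model (a_re + j*a_im) * (b_re + j*b_im) using two multipliers per cycle."""
    # Cycle 1: two products combined by the adder in subtract mode -> real part.
    real = a_re * b_re - a_im * b_im
    # Cycle 2: the same two multipliers reused, adder in add mode -> imaginary part.
    imag = a_re * b_im + a_im * b_re
    return real, imag


if __name__ == "__main__":
    # (3 + 4j) * (1 - 2j)
    print(complex_multiply_two_cycle(3, 4, 1, -2))
```

Only two multipliers are active in any one cycle, at the cost of taking two cycles per complex result, which is the trade-off the multiply-add/sub configuration exploits.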
Table 1: Total Resource Usage for this design mapped to a Lattice ECP33 FPGA
| Resource | Used | Percent |
|---|---|---|
| DSP blocks (29 multipliers used) | 5.75 | 72 |
| Embedded Block RAMs (EBRs) | 38 | 70 |
| Logic slices | 12589 | 77 |
| LUT4 elements (of which 532 LUT4s used as Distributed RAMs) | 19872 | 61 |
| PLLs | 2 | 50 |
Design Performance
A quantized Matlab reference transmitter model was used both to validate the performance of the HDL transmitter design and to generate vectors to test the receiver. The Matlab channel model included a SUI multipath delay, phase noise and AWGN, as well as carrier frequency and sample rate errors of up to ±13% of sub-carrier spacing. The IEEE 802.16-2004 specification limits the allowable carrier frequency and sample rate errors at the base station receiver to no more than ±2%. The 13% value modeled here is equivalent to Doppler shift caused by a transmitter or receiver travelling at 50 km/h with a nominal channel bandwidth of 1.75 MHz.
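Two of the impairments mentioned above, carrier frequency error and AWGN, can be sketched in a few lines. This is not the Matlab channel model used for the design: it is a minimal Python illustration, the FFT size of 256 is an assumption, and multipath, phase noise and sample rate error are left out. A frequency error expressed as a fraction ε of the sub-carrier spacing advances the phase by 2πε/N per sample, where N is the FFT size.

```python
import cmath
import math
import random


def apply_cfo(samples, offset_fraction, n_fft=256):
    """Rotate samples by a carrier offset given as a fraction of sub-carrier spacing.

    n_fft=256 is an ASSUMPTION (802.16-2004 OFDM PHY FFT size), not from the article.
    """
    step = 2.0 * math.pi * offset_fraction / n_fft  # phase advance per sample
    return [s * cmath.exp(1j * step * n) for n, s in enumerate(samples)]


def add_awgn(samples, snr_db, rng):
    """Add complex white Gaussian noise at the requested SNR in dB."""
    signal_power = sum(abs(s) ** 2 for s in samples) / len(samples)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power / 2.0)  # per real / imaginary component
    return [s + complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
            for s in samples]


if __name__ == "__main__":
    rng = random.Random(0)
    symbol = [1 + 0j] * 256                 # placeholder test signal
    impaired = add_awgn(apply_cfo(symbol, 0.13), 24.4, rng)
    print(len(impaired))
```

The 0.13 offset and 24.4 dB SNR in the usage lines simply reuse the figures quoted in this section.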
The 64-QAM-3/4 receiver sensitivity test requires a final bit error rate (BER) of less than 10⁻⁶ at an AWGN SNR of 24.4 dB. The figure below shows bit error rate and packet error rate (PER) measurements at the output of various stages of the receiver. The plots show that the design meets the specification.
Receiver Performance (64-QAM-3/4) in AWGN channel
