# Notes on ARM & FPGA communications

- [Notes on ARM \& FPGA communications](#notes-on-arm--fpga-communications)
- [Table of Contents](#table-of-contents)
- [INTERFACE FROM THE ARM TO THE FPGA](#interface-from-the-arm-to-the-fpga)
- [FPGA modes](#fpga-modes)
- [ARM FPGA communications](#arm-fpga-communications)
- [ARM GPIO setup](#arm-gpio-setup)
- [FPGA Setup](#fpga-setup)
- [HARDWARE OVERVIEW](#hardware-overview)
- [ADC (ANALOG TO DIGITAL CONVERTER)](#adc-analog-to-digital-converter)
- [FIELD PROGRAMMABLE GATE ARRAY, FPGA](#field-programmable-gate-array-fpga)
- [MICROCONTROLLER](#microcontroller)
- [To behave like a READER](#to-behave-like-a-reader)
- [To behave like a TAG](#to-behave-like-a-tag)
- [To sniff traffic](#to-sniff-traffic)
- [FPGA purpose](#fpga-purpose)
https://github.com/RfidResearchGroup/proxmark3/blob/master/doc/original_proxmark3/proxmark3.pdf

INTERFACE FROM THE ARM TO THE FPGA
==================================
The FPGA and the ARM can communicate in two main ways: using the ARM's
general-purpose synchronous serial port (the SSP), or using the ARM's
SPI port. The SPI port is used to configure the FPGA. The ARM writes a
configuration word to the FPGA, which determines what operation will
be performed (e.g. read 13.56 MHz vs. read 125 kHz vs. read 134 kHz
vs...). The SPI is used exclusively for configuration.

The SSP is used for actual data sent over the air. The ARM's SSP can
work in slave mode, which means that we can send the data using clocks
generated by the FPGA (either from the PCK0 clock, which the ARM itself
supplies, or from the 13.56 MHz clock, which is certainly not going to
be synchronous to anything in the ARM), which saves synchronizing logic
in the FPGA. The SSP is bi-directional and full-duplex.
The FPGA communicates with the ARM through either:
1) the SPI port (the ARM is the master)
2) the SSC synchronous serial port (the ARM is the master).
Opamps (note: the choice of opamp affects the ARM source code that calculates the actual voltage from the antenna. Manufacturers never report which opamps they use, to much frustration.)

LF analog path (MCP6294 opamp, which has a GBW of 10 MHz): carries all 'slow' signals and is used for low frequency signals. It follows the peak detector. The signal is centered around the generated voltage Vmid.

Since the SPARTAN II is an old, outdated FPGA with very limited resources, the LF and HF functionality had to be split into two separate FPGA images, which are stored in ARM flash memory as bitstreams.

We swap between these images by reloading the FPGA from the ARM on the go. A swap takes about 1 second. Hence it's usually a bad idea to program your device to continuously alternate between LF and HF commands.
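
To make the cost of alternating concrete, the sketch below counts how many image reloads a given command order would trigger. It is only an illustration: `fpga_image_t`, `IMAGE_LF`, `IMAGE_HF` and `count_reloads` are hypothetical names invented here, not firmware identifiers, and each reload is assumed to cost roughly the one second mentioned above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical labels for the two FPGA images (not the firmware's names). */
typedef enum { IMAGE_LF, IMAGE_HF } fpga_image_t;

/* Count how many FPGA reloads are needed to run commands in the given order.
 * `loaded` is the image assumed to be in the FPGA before the first command;
 * `needs[i]` is the image required by command i. */
static size_t count_reloads(fpga_image_t loaded, const fpga_image_t *needs, size_t n) {
    size_t reloads = 0;
    for (size_t i = 0; i < n; i++) {
        if (needs[i] != loaded) {
            loaded = needs[i];   /* swapping images costs one reload */
            reloads++;
        }
    }
    return reloads;
}
```

With LF already loaded, the order LF, HF, LF, HF needs three reloads, while LF, LF, HF, HF needs only one, so grouping commands by frequency band keeps the swap overhead down.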
The FPGA images are precompiled and located inside the /fpga folder.

The images change very rarely, so there is no need to set up an FPGA tool chain to compile them yourself.
Since the FPGA is very old, the Xilinx WebPack ISE 10.1 is the last working tool chain. You can download this legacy development package from Xilinx and register for a free product installation id.
Or use mine `11LTAJ5ZJK3PXTUBMF0C0J6C4`. The package to download is about 7 GB and Linux-based, though I recently managed to install it on WSL for Windows 10.

A prebuilt docker image with WebPack installed is available, which you can use to easily compile the images:
docker pull nhutton/prox-container:webp_image_complete
docker run -v <LOCAL_PATH>/proxmark3:/tmp --rm -it nhutton/prox-container:webp_image_complete bash
$ cd /tmp/proxmark/fpga
In order to save space, these FPGA images are LZ4 compressed and included in the fullimage.elf file when compiling the ARM source with `make armsrc`.
This saves some precious space on the ARM, but it makes loading the FPGA a bit more complex, since the image has to be decompressed on the fly.
## ARM FPGA communications

The ARM talks with the FPGA over the Synchronous Serial Port (SSC) rx and tx.

The ARM sends a 16-bit configuration word that fits the selected major mode.
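
As a rough illustration of what a 16-bit configuration word can look like, the sketch below packs a command and a value into one word. The layout (4-bit command in the top bits, 12-bit value below it) and every name prefixed with `ex_` or `EX_` are assumptions made for this example; the real field layout is defined by the firmware and FPGA sources and should be checked there.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout, for illustration only:
 *   bits 15..12 = command, bits 11..0 = value (e.g. mode bits). */
#define EX_CMD_SHIFT   12u
#define EX_VALUE_MASK  0x0FFFu

/* Build a 16-bit word from a 4-bit command and a 12-bit value. */
static uint16_t ex_pack_word(uint8_t cmd, uint16_t value) {
    uint16_t cmd_bits = (uint16_t)((uint16_t)(cmd & 0x0Fu) << EX_CMD_SHIFT);
    return (uint16_t)(cmd_bits | (value & EX_VALUE_MASK));
}

/* Extract the command field from a packed word. */
static uint8_t ex_word_cmd(uint16_t word) {
    return (uint8_t)(word >> EX_CMD_SHIFT);
}

/* Extract the value field from a packed word. */
static uint16_t ex_word_value(uint16_t word) {
    return (uint16_t)(word & EX_VALUE_MASK);
}
```

Masking both fields before combining them means an out-of-range value cannot spill into the command bits.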
// First configure the GPIOs, and get ourselves a clock.
AT91C_BASE_PIOA->PIO_ASR =
AT91C_BASE_PIOA->PIO_PDR = GPIO_SSC_DOUT;
AT91C_BASE_PMC->PMC_PCER = (1 << AT91C_ID_SSC);
// Now set up the SSC proper, starting from a known state.
AT91C_BASE_SSC->SSC_CR = AT91C_SSC_SWRST;
// RX clock comes from TX clock, RX starts on Transmit Start,
// data and frame signal is sampled on falling edge of RK
AT91C_BASE_SSC->SSC_RCMR = SSC_CLOCK_MODE_SELECT(1) | SSC_CLOCK_MODE_START(1);
// 8, 16 or 32 bits per transfer, no loopback, MSB first, 1 transfer per sync
// pulse, no output sync
if ((FPGA_mode & FPGA_MAJOR_MODE_MASK) == FPGA_MAJOR_MODE_HF_READER && FpgaGetCurrent() == FPGA_BITSTREAM_HF) {
    AT91C_BASE_SSC->SSC_RFMR = SSC_FRAME_MODE_BITS_IN_WORD(16) | AT91C_SSC_MSBF | SSC_FRAME_MODE_WORDS_PER_TRANSFER(0);
    AT91C_BASE_SSC->SSC_RFMR = SSC_FRAME_MODE_BITS_IN_WORD(8) | AT91C_SSC_MSBF | SSC_FRAME_MODE_WORDS_PER_TRANSFER(0);
// TX clock comes from TK pin, no clock output, outputs change on rising edge of TK,
// TF (frame sync) is sampled on falling edge of TK, start TX on rising edge of TF
AT91C_BASE_SSC->SSC_TCMR = SSC_CLOCK_MODE_SELECT(2) | SSC_CLOCK_MODE_START(5);
// tx framing is the same as the rx framing
AT91C_BASE_SSC->SSC_TFMR = AT91C_BASE_SSC->SSC_RFMR;
// Set up DMA to receive samples from the FPGA. We will use the PDC, with
// a single buffer as a circular buffer (so that we just chain back to
## ADC (ANALOG TO DIGITAL CONVERTER)

The analogue signal that comes from the antenna circuit is fed into an 8-bit Analogue to Digital Converter
(ADC). This delivers 8 output bits in parallel which represent the current voltage retrieved from the field.
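
For orientation, an 8-bit sample can be mapped back to a voltage at the ADC input as in the minimal sketch below. It assumes an ideal, linear ADC whose full-scale code 255 corresponds to a reference voltage passed in by the caller; the function name and the reference value used in any call are assumptions for illustration, and the gain or attenuation of the analog front end ahead of the ADC is not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Convert an 8-bit ADC code to millivolts at the ADC input.
 * Assumes a linear ADC where code 255 equals `vref_mv` (caller-supplied).
 * Any opamp gain or divider ratio before the ADC is NOT accounted for. */
static uint32_t adc8_to_millivolts(uint8_t sample, uint32_t vref_mv) {
    return ((uint32_t)sample * vref_mv) / 255u;
}
```

With only 256 codes, one step at a hypothetical 3300 mV reference is about 13 mV, which is the resolution limit to keep in mind when reading the sampled values.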
## FIELD PROGRAMMABLE GATE ARRAY, FPGA

The 8 output pins from the ADC are connected to 8 pins of the Field Programmable Gate Array (FPGA). An
FPGA has a great advantage over a normal microcontroller in the sense that it emulates hardware. A
hardware description can be compiled and flashed into an FPGA.

Because basic arithmetic functions can be performed fast and in parallel by an FPGA, it is faster than an
implementation on a normal microcontroller. Only a real hardware implementation would be faster, but
this lacks the flexibility of an FPGA.

The FPGA can therefore be seen as dynamic hardware. It is possible to make a hardware design and flash
it into the memory of the FPGA. This gives some major advantages:

- "Hardware" errors can be corrected; the FPGA can be flashed with a new hardware design.
- Although not as fast as a real hardware implementation, an FPGA is faster than its equivalent on a microprocessor. That is, it is specialized for one job.

The FPGA has two main tasks. The first task is to demodulate the signal received from the ADC and relay
this as a digital encoded signal to the ARM. Depending on the task this might be the demodulation of a
100% Amplitude Shift Keying (ASK) signal from the reader or the load modulation of a card. The encoding
schemes used to communicate the signal to the ARM are Modified Miller for the reader and Manchester
encoding for the card signal.
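
To illustrate what decoding a Manchester-encoded signal involves, here is a minimal sketch that turns half-bit levels into bits. The polarity convention (high then low is a 1, low then high is a 0) and the function name are assumptions for this example; actual standards define their own polarity, and this is not the firmware's decoder. Modified Miller decoding is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode half-bit levels (each 0 or 1) into bits.
 * Assumed convention: (1,0) -> bit 1, (0,1) -> bit 0.
 * Returns the number of decoded bits, or -1 if the input length is odd
 * or a bit period has no mid-bit transition (invalid Manchester). */
static int manchester_decode(const uint8_t *halfbits, size_t n_halfbits, uint8_t *bits_out) {
    if (n_halfbits % 2 != 0) {
        return -1;
    }
    size_t n_bits = n_halfbits / 2;
    for (size_t i = 0; i < n_bits; i++) {
        uint8_t first = halfbits[2 * i];
        uint8_t second = halfbits[2 * i + 1];
        if (first == second) {
            return -1;          /* no transition in the middle of the bit */
        }
        bits_out[i] = first ? 1 : 0;
    }
    return (int)n_bits;
}
```

The mandatory mid-bit transition is what lets a decoder detect loss of synchronisation: a pair without a transition cannot be valid data.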

The second task is to modulate an encoded signal that is received from the ARM into the field of the
antenna. This can be the encoding of either reader messages or card messages. For reader messages the
FPGA generates an electromagnetic field at high power and drops the amplitude for short periods.

The microcontroller is responsible for the protocol management. It receives the digital encoded signals
from the FPGA and decodes them. The decoded signals can just be copied to a buffer in the EEPROM
memory. Additionally, an answer to the received message can be sent by encoding a reply and
communicating this to the FPGA.

The microcontroller (ARM) implements the transport layer. First it decodes the samples received from
the FPGA. These samples are stored in a Direct Memory Access (DMA) buffer. The samples are binary
sequences that represent whether the signal was high or low. The software on the ARM tries to decode
these samples. When the Proxmark3 is in sniffing mode this is done for both Manchester and Modified
Miller encoding at the same time. Whenever one of the decoding procedures returns a valid message, this message
is stored in another buffer (BigBuf) and both decoding procedures are set to an un-synced state. The
BigBuf is limited to the available memory on the ARM. The current firmware has 2 KB of memory
reserved for traces (besides the trace, the buffer also stores some temporary data that is needed in the
processing). When the BigBuf buffer is full the function normally returns. A new function call from the
client is needed to download the BigBuf contents to the computer. The BigBuf is especially useful for
protocol investigation. Every single message is stored in this buffer. When a card is emulated or when the
Proxmark is used as a reader the BigBuf can be used to store status messages or protocol exceptions.
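
The "store until full, then return" behaviour described above can be sketched as a fixed-size trace buffer. Everything here is hypothetical: the `ex_` names, the record format (a 2-byte length followed by the payload) and the 2048-byte size, which simply mirrors the 2 KB figure in the text. The real BigBuf trace records carry more fields than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EX_TRACE_SIZE 2048u   /* mirrors the 2 KB quoted in the text */

typedef struct {
    uint8_t data[EX_TRACE_SIZE];
    size_t used;              /* number of bytes currently occupied */
} ex_trace_t;

static void ex_trace_clear(ex_trace_t *t) {
    t->used = 0;
}

/* Append one message as [len low byte][len high byte][payload].
 * Returns false, leaving the buffer untouched, if the record does not fit;
 * the caller would then stop capturing and let the client download. */
static bool ex_trace_append(ex_trace_t *t, const uint8_t *msg, uint16_t len) {
    size_t needed = (size_t)len + 2u;
    if (needed > EX_TRACE_SIZE - t->used) {
        return false;
    }
    t->data[t->used++] = (uint8_t)(len & 0xFFu);
    t->data[t->used++] = (uint8_t)(len >> 8);
    memcpy(&t->data[t->used], msg, len);
    t->used += len;
    return true;
}
```

Rejecting a record that does not fit, rather than truncating it, keeps every stored message complete, which matters when the buffer is later parsed on the client side.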

-- ANTENNA -> rectifying -> lowpass filter -> ADC -> FPGA -> ARM -> USB/CDC | FPC -> CLIENT
induct peak detect (8bit) -- modes:
via circuit HF - peak-detected

-- ANTENNA -> rectifying -> lowpass filter -> ADC -> FPGA -> ARM -> USB/CDC | FPC -> CLIENT
induct peak detect (8bit) -- modes:
via circuit LF - peak-detected

1. Dynamic range of signal, i.e. high carrier signal (reader) and low

## To behave like a READER

By driving all of the buffers LOW, it is possible to make the antenna
look to the receive path like a parallel LC circuit; this provides a
high-voltage output signal. This is typically what will be done when we
are not actively transmitting a carrier (i.e., behaving as a reader).

## To behave like a TAG

On the receive side, there are two possibilities, which are selected by
RLY1. A mechanical relay is used, because the signal from the antenna is
likely to be more positive or negative than the highest or lowest supply
voltages on-board. In the usual case (PEAK-DETECTED mode), the received
signal is peak-detected by an analog circuit, then filtered slightly,
and then digitized by the ADC. This is the case for both the low- and
high-frequency paths, although the details of the circuits for the
two cases are somewhat different. This receive path would typically
be selected when the device is behaving as a reader, or when it is
eavesdropping at close range.

It is also possible to digitize the signal from the antenna directly (RAW
mode), after passing it through a gain stage. This is more likely to be
useful in reading signals at long range, but the available dynamic range
will be poor, since it is limited by the 8-bit A/D.

In either case, an analog signal is digitized by the ADC, and
from there goes in to the FPGA. The FPGA is big enough that it
can perform DSP operations itself. For some high-frequency standards,
the subcarriers are fast enough that it would be inconvenient to do all
the math on a general-purpose CPU. The FPGA can therefore correlate for
the desired signal itself, and simply report the total to the ARM. For
low-frequency tags, it probably makes sense just to pass data straight

The FPGA communicates with the ARM through either its SPI port (the ARM
is the master) or its generic synchronous serial port (again, the ARM
is the master). The ARM connects to the outside world over USB.

Digital signal processing.
In short: apply low pass / high pass filtering, peak detect, and correlate the signal, which means collecting IQ pairs.

IQ means measuring the signal in-phase (I) and again a 90 degree phase shift later, in quadrature (Q). With IQ samples you can plot the signal on a vector plane.
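
The idea of collecting an IQ pair by correlation can be sketched in software as follows. This is an illustration under stated assumptions, not the FPGA's actual logic: the subcarrier is assumed to span exactly four samples per period, the references are square waves, the quadrature reference is the in-phase one delayed by one sample (90 degrees of a four-sample period), and all `ex_` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    int32_t i;   /* in-phase correlation sum */
    int32_t q;   /* quadrature correlation sum */
} ex_iq_t;

/* Correlate 8-bit samples against two square-wave references that are
 * 90 degrees apart, assuming 4 samples per subcarrier period. */
static ex_iq_t ex_correlate_iq(const uint8_t *samples, size_t n) {
    static const int8_t ref_i[4] = { +1, +1, -1, -1 };
    static const int8_t ref_q[4] = { -1, +1, +1, -1 };  /* ref_i delayed by one sample */
    ex_iq_t acc = { 0, 0 };
    for (size_t k = 0; k < n; k++) {
        acc.i += ref_i[k % 4] * (int32_t)samples[k];
        acc.q += ref_q[k % 4] * (int32_t)samples[k];
    }
    return acc;
}

/* Squared length of the (I, Q) vector; independent of the signal's phase
 * relative to the references in the ideal case. */
static int64_t ex_iq_power(ex_iq_t v) {
    return (int64_t)v.i * v.i + (int64_t)v.q * v.q;
}
```

Because each reference sums to zero over a period, a constant (DC) level in the samples cancels out, and only the component at the assumed subcarrier frequency contributes to the totals.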

IQ1 = 1,1 : 1, -1 (rising)
IQ2 = -1,1 : 1, 1 (falling)
----------0------------>