Showing posts from 2017

Machine Learning Optimizes FPGA Timing

By Bernard Murphy (*)

Machine learning (ML) is the hot new technology of our time so EDA development teams are eagerly searching for new ways to optimize various facets of design using ML to distill wisdom from the mountains of data generated in previous designs. Pre-ML, we had little interest in historical data and would mostly look only at localized comparisons with recent runs to decide whatever we felt were best-case implementations. Now, prompted by demonstrated ML-value in other domains, we are starting to look for hidden intelligence in a broader range of data.

One such direction uses machine-learning methods to find a path to optimization. Plunify does this with their InTime optimizer for FPGA design. The tool operates as a plugin to a variety of standard FPGA design tools but does the clever part in the cloud (private or public at your choice), in which the goal is to provide optimized strategies for synthesis and place and route.

There is a very limited way to do this toda…


FPGAs and GPUs: a Tour of SETI's Computer Hardware

David MacMahon is a research astronomer with Berkeley SETI Research Center. Dave works on several projects at BSRC, including Breakthrough Listen, designing many of the computer systems we use to process data collected from our telescopes. If you've ever been curious what hardware is required to search for ET, check out this tour of Berkeley SETI behind the scenes.

Slight Street Sign Modifications Can Completely Fool Machine Learning Algorithms

Machine Learning is a hot R&D topic.

One of the strategies used in Machine Learning is to learn by means of neural networks. You can get a free introduction to neural networks here.
I also warmly recommend Andrew Ng's introductory course to Machine Learning on Coursera.

Machine Learning neural networks were inspired by biological neural networks. They are easy to apply yet highly effective in image processing tasks such as handwritten text recognition.

More complex neural network algorithms are implemented in what is called Deep Learning, which uses neural networks with many layers.

Typically a neural network is trained, or learns, from its exposure to thousands of 'good' and 'bad' examples of the image to be recognized or classified. For example, a neural network that has to recognize handwritten numbers will be exposed to thousands of examples of numbers written by different people, and even with changes in the orientation of the te…

Best FPGA development practices - Whitepaper

This whitepaper by Charles Fulk and RC Cofer is an excellent summary of several techniques, tools and guidelines for FPGA design:

- FPGA design process
- Revision control
- Coding guidelines
- Scripting automation
- PCB design for FPGA
- VHDL capture and simulation (including the OSVVM package)
- Project Management
- Design Resources

The whitepaper is available here

Xilinx AXI Stream tutorial - Part 2

Hi again,

In the previous part of this tutorial we presented the AXI Streaming interface, its main signals and some of its applications.

Now let's move on to the fun part, that is, actually writing and testing some VHDL code to implement our AXI master. We will proceed gradually, adding features as we go. At the end of this tutorial you will have code that includes:

- An AXI master with variable packet length
- Flow control support (ready and valid)
- An option to generate several kinds of data patterns
- A testbench to check that all features work OK
- An instantiation of Xilinx's AXI Stream protocol checker IP to verify the correctness of our AXI master core

So let's see the first version of an AXI master. In this version the packet has a fixed data length, and the data is a progression of ascending numbers (the same counter that detects when the packet length has been reached is also used to generate the packet data):

…
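As a rough behavioral sketch of the idea described above (in Python rather than VHDL, and not the tutorial's own code), a fixed-length master whose counter doubles as the data source could be modeled cycle by cycle as shown here. The function name, the choice to start counting at 0, and the ready/valid handshake in this model are all assumptions made for illustration.

```python
def axi_master_fixed(packet_len, ready_pattern):
    """Cycle-by-cycle model of a fixed-length AXI4-Stream master (hypothetical).

    ready_pattern: iterable of booleans, the slave's TREADY on each clock.
    Returns the list of (tdata, tlast) beats accepted by the slave.
    """
    counter = 0          # counts beats in the packet and also supplies TDATA
    accepted = []
    for tready in ready_pattern:
        tvalid = True                      # assumption: the master always has data
        tdata = counter
        tlast = counter == packet_len - 1  # mark the final beat of the packet
        if tvalid and tready:              # a beat moves only when both are high
            accepted.append((tdata, tlast))
            counter = 0 if tlast else counter + 1
        # when TREADY is low the counter holds, so the same data is offered again
    return accepted


if __name__ == "__main__":
    # The slave stalls on the second clock; the master repeats the same beat.
    for beat in axi_master_fixed(3, [True, False, True, True, True, True]):
        print(beat)
```

The point of the model is only that one counter can serve both purposes: comparing it against the packet length produces TLAST, and its value is the payload.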

Xilinx AXI Stream tutorial - Part 1


In this series of articles I am going to present the design of an AXI4-Stream master. As I often do in my tutorials, I will try to show the design procedure for the block, starting from a "bare bones" solution and gradually adding features to it.

Xilinx provides a wide range of AXI peripherals/IPs from which to choose. My purpose in making my own block was to learn the protocol hands-on. As a side effect, this tutorial provides you with a (synthesizable) AXI4-Stream master, which I have not seen provided by Xilinx. The closest IP provided by Xilinx that I know of is an AXI memory mapped to AXI stream block.

But first things first: what is AXI4-Stream? Streaming is a way of sending data from one block to another. The idea behind streaming devices is to provide a steady flow of high speed data, so usually one new block of data is transferred every clock pulse. Also, to reduce overhead, streaming buses do not have addressing. Streaming connections are point to …

Spartan 7 now available

"Xilinx announced today that its Spartan-7 family of FPGAs is now available for order and shipping to standard lead times. As a key member of Xilinx's Cost-Optimized Portfolio, this device family is designed to meet the needs of cost-sensitive markets by delivering low cost and low power entry points that are I/O optimized for connectivity with industry leading performance-per-watt"

For more information:
Spartan-7 general availability announcement
Spartan-7 device page

Match rockets


Introduction to Verilog

This blog's motto is FPGA projects in VHDL, and it also includes free VHDL books. But when I saw this nice short Verilog guide in a comment on Hacker News some time ago, I knew I had to share it here.

Besides, the FPGA world is evolving. In my experience, it is not enough to know one manufacturer; you may need more. Nor is it enough to know only VHDL; you had better know Verilog too, and even start thinking about learning other tools, such as HLS.

So, now that I have (more or less) finished justifying why I am publishing this link, here it is

The short, 32 page guide includes the following subjects:

- Gate-Level Modelling
- Data Types
- Operators
- Operands
- Modules
- Behavioral Modeling
- Functions
- Component Inference
- Finite State Machines
- Compiler Directives
- System Tasks and Functions
- Test Benches, and
- Memories

This short and useful Introduction to Verilog is published by Carleton University

DTMF encoding and decoding

Dual Tone Multi-Frequency (DTMF) is a method for encoding and decoding up to sixteen digits and special characters to be sent over a voice channel.

DTMF was first developed by the Bell System in the United States, for use in push-button dialing telephones (in contrast to prior phones, which had a mechanical rotary dialing system).

DTMF is standardized by ITU-T Recommendation Q.23.

A DTMF keypad consists of a matrix of sixteen push buttons organized in four rows by four columns. Each button, when pressed, generates a pair of tones. The tones belong to two groups, a low frequency group (697 to 941 Hz) and a high frequency group (1209 to 1633 Hz). On the picture below you can see the buttons and associated frequencies:

Pressing an '8' generates two tones, one low frequency tone of 852 Hz and one high frequency tone of 1336 Hz.

The frequencies were selected so that none of them is a harmonic of another DTMF frequency.
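The key-to-tone mapping can be sketched in a few lines of Python. This is only an illustrative model, not the encoder from the tutorial: the function names, the 8 kHz sample rate and the 50 ms tone duration are assumptions, while the row and column frequencies follow the standard 4 x 4 keypad layout.

```python
import math

LOW_HZ = (697, 770, 852, 941)          # one frequency per keypad row
HIGH_HZ = (1209, 1336, 1477, 1633)     # one frequency per keypad column
KEYPAD = ("123A", "456B", "789C", "*0#D")


def dtmf_freqs(key):
    """Return the (low, high) tone pair in Hz for a single keypad character."""
    if len(key) != 1:
        raise ValueError("expected a single key")
    for row, keys in enumerate(KEYPAD):
        col = keys.find(key)
        if col >= 0:
            return LOW_HZ[row], HIGH_HZ[col]
    raise ValueError(f"not a DTMF key: {key!r}")


def dtmf_samples(key, duration_ms=50, sample_rate=8000):
    """Sum of the two sine tones for one key (duration and rate are assumed values)."""
    low, high = dtmf_freqs(key)
    n = sample_rate * duration_ms // 1000
    return [
        0.5 * math.sin(2 * math.pi * low * i / sample_rate)
        + 0.5 * math.sin(2 * math.pi * high * i / sample_rate)
        for i in range(n)
    ]


def has_harmonic_clash():
    """True if any DTMF frequency is an integer multiple of another one."""
    freqs = LOW_HZ + HIGH_HZ
    return any(a != b and b % a == 0 for a in freqs for b in freqs)


if __name__ == "__main__":
    print(dtmf_freqs("8"))
    print(has_harmonic_clash())
```

The last helper simply checks the harmonic property stated above by testing every pair of frequencies for an integer multiple.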

As it usually is for many digital transmission systems, the encoder is quite …

Timers block - Part three

In the first part of this tutorial, we discussed the implementation of a single timer.
The second part presented the implementation of a register-based timers block.

In this (third) part of the tutorial we will see a different way to implement the timers block. The timers block is a rather thirsty animal, let's see how many resources it needs for several configurations:

Configuration               Quantity of LUT   Quantity of FF
Single 32 bit timer                      43               33
16 x 32 bit timers block                704              528
32 x 32 bit timers block              1,408            1,056
32 x 64 bit timers block              2,848            2,080

These numbers can be obtained by changing the DATA_W and TIMERS parameters on the VHDL package file and running synthesis for each configuration. After synthesis, in Vivado, we can get the number of used resources by taking a look at "Report utilization".
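To see how the cost scales, the figures above can be reduced to a per-timer cost with a short script. This is just arithmetic on the reported numbers; the variable and function names are made up for illustration.

```python
# (number of timers, width in bits, LUTs, FFs), copied from the figures above
RESULTS = [
    (1, 32, 43, 33),
    (16, 32, 704, 528),
    (32, 32, 1408, 1056),
    (32, 64, 2848, 2080),
]


def per_timer(results):
    """Return (timers, width, LUTs per timer, FFs per timer) for each configuration."""
    return [(t, w, luts / t, ffs / t) for t, w, luts, ffs in results]


if __name__ == "__main__":
    for timers, width, luts, ffs in per_timer(RESULTS):
        print(f"{timers:2d} x {width} bit: {luts:.0f} LUT/timer, {ffs:.0f} FF/timer")
```

In these figures the flip-flop count is exactly the number of timers multiplied by (width + 1), and the LUT count per timer roughly doubles when the width doubles, so the block grows linearly in both parameters.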

A single 32 bit timer takes 33 flip-flops which is quite reasonable. Thirty-two are needed for the timer alone. As the quantity of timers increases (or their width, or both), the quantity of F…

FPGA internal tri-state buses

For many designers, the first time we saw the internal memory blocks in an FPGA came as a little shock.

Some of us were used to the RAM devices used in board design. These devices use bidirectional data buses. Even the fastest memories, DDRn DRAMs, use bidirectional data buses ('n' has changed over the years, from plain DDR to current DDR4).

So, how come internal memories on an FPGA have TWO data buses? Isn't that a waste of resources? Why don't FPGAs have internal tri-state buses?

Well, until around fifteen years ago, some FPGA devices DID have internal tri-state buffers. With the evolution of semiconductor technology, internal tri-state buffers were abandoned. So today, FPGAs don't have internal tri-state buffers; they have unidirectional buses only. Since most memories are readable and writable, two unidirectional data buses are needed between a controller (CPU, internal FPGA logic) and the memory.

If this answer is enough for you, you can stop reading here. If you wan…

The MicroZed chronicles - free FPGA book

Adam Taylor is the well-known author of the MicroZed Chronicles blog on the Xilinx website. His Chronicles have been running for several years, and Adam has already compiled entries from his blog into two books. The first book is offered for free on the FPGARelated website for registered users.

This is a partial list of the book contents:

- Introduction to the Zynq
- Software environment and configuration
- The Boot loader
- XADC
- Multiplexed IO
- Timers, clocks and watchdogs
- Processing System and Programmable Logic
- DMA
- Adding an Operating System
- MultiProcessing
- etc.

The book can be found here. Author Adam Taylor is a regular contributor to the Xilinx Xcell Daily Blog and he also has his own website.

Xilinx Announces General Availability of Virtex UltraScale+ FPGAs in Amazon EC2 F1 Instances

"Xilinx today announced that its high-performance Xilinx® Virtex® UltraScale+™ FPGAs are available in Amazon Elastic Compute Cloud (Amazon EC2) F1 instances. This instance provides programmable hardware acceleration with FPGAs and enables users to optimize their compute resources for the unique requirements of their workloads...

... F1 instances will be used to solve complex science, engineering, and business problems that require high bandwidth, enhanced networking, and very high compute capabilities. They are particularly beneficial for applications that are time sensitive such as clinical genomics, financial analytics, video processing, big data, security, and machine learning. "

The Virtex UltraScale+ family is based on the new 16 nm FinFET+ technology, and has the following features:

- Up to 8GB of HBM Gen2 integrated in-package
- Up to 500Mb of on-chip memory integration
- Integrated 100G Ethernet MAC with RS-FEC and 150G Interlaken cores
- Up to four speed-grade improvement wit…

Size of wind turbines over the years


Square waveform generator

In the following three-part tutorial, a square waveform generator is presented.
The requirements for the project are to generate a sequence of square waveforms with different frequencies. For each frequency a number of cycles is generated (different for each one). For each frequency, a distinct duty cycle is also defined.

In this implementation the frequency is defined in Hz, and the active high time in ns. The VHDL code does not validate the parameters, i.e., if the active high time for any frequency is longer than its period, the output will always be '1' for that frequency.
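Since the VHDL does not validate the parameters, the relationship between frequency and active high time can be checked beforehand with a small script. This is a minimal sketch; the function name and the example values are hypothetical and are not taken from the project.

```python
def check_waveform(freq_hz, high_ns):
    """Return (period in ns, duty cycle, valid) for one frequency entry.

    'valid' is False when the active high time does not fit inside the
    period, which is the case where the output would stay at '1'.
    """
    period_ns = 1e9 / freq_hz
    duty = high_ns / period_ns
    return period_ns, duty, high_ns < period_ns


if __name__ == "__main__":
    # Hypothetical entries: (frequency in Hz, active high time in ns)
    for freq_hz, high_ns in [(1_000_000, 250), (2_000_000, 600)]:
        period_ns, duty, valid = check_waveform(freq_hz, high_ns)
        status = "ok" if valid else "output stuck at '1'"
        print(f"{freq_hz} Hz, {high_ns} ns high: period {period_ns} ns, duty {duty:.2f}, {status}")
```

Running a check like this on each frequency entry before synthesis catches the stuck-high case without touching the VHDL.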

This project was born out of a discussion on the Xilinx forums. Once I had done the project for a specific configuration, I started thinking about a way to make a generic solution, and this tutorial tries to reflect the design process of this small project.

The code is presented below. Three different frequencies FREQ1..3 are defined for this example, 242 kHz, 23 kHz …

The single biggest reason why startups succeed

Bill Gross (himself an entrepreneur who has founded quite a few startups) analyzes the main reasons why startups succeed or fail, and pinpoints a surprising factor as the most important one.

As time goes by

The comic strip says it all...
... or maybe not.

There are so many missing:

- Ethernet, Fast Ethernet, Gigabit Ethernet
- The diverse Windows versions... and Unix, and Linux
- Serial Rapid IO, Infiniband
- PDAs, disc-man, MP3-player, tablets, digital cameras
- ADSL, optic fiber, and so many others...

BTW, I first thought about this comic strip while reading about IoT. Will it be a success or will it be forgotten?
I remember that in its time, ATM was also thought to be "the next thing, for sure". Well, it didn't happen...
What would your list have?

VHDL or Verilog?

This question gets asked again and again, by beginners and experienced designers alike. When I saw it posted on the FPGA group at reddit some time ago, I liked the answer from user fft32, so with his permission, I reproduce it here with some minor changes and additions.

VHDL compared to Verilog

VHDL:

- A bit verbose, clunky syntax. I never liked that different constructs have different rules for the "end" tag, like "end synth" for architectures, versus "end component mux" for components. I always find myself looking up the syntax of packages and functions.
- Strongly typed: It's a bit of a pain to have to make a (0 downto 0) vector to do something like a carry-in, but at the end of the day, it can save you time debugging problems. You don't scratch your head as to why your 10-bit vector is only 0 to 1, because you assigned a 1-bit value to it (a thing you could do in Verilog, but in VHDL would produce a compile error). Also, by default Verilog…