Fen Logic Ltd. Clock mux:
Module to safely (i.e. glitch-free) switch between two unrelated clocks. Switching from one clock to another is a tricky business: you must make sure that no pulses, low or high, appear at the output that are too short. This clock mux does it safely for you. The output goes low on the falling edge of the currently running clock, then stays low for a period longer than the low period of either clock. It starts again (going high) on the rising edge of the newly selected clock.
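The switching sequence described above can be tried out in a small behavioural model. The following is only a Python sketch, not the Verilog module itself: the function names, the integer time ticks, the ideal 50% duty-cycle clocks and the way the guard time is chosen are all assumptions made for illustration.

```python
def clock(t, period):
    """Ideal 50% duty-cycle square wave at integer time t (high half first)."""
    return 1 if (t % period) < period // 2 else 0


def mux_waveform(period_a, period_b, switch_time, t_end):
    """Mux output, one sample per tick, for a switch request from A to B."""
    # Assumed guard time: one tick longer than the longer low period.
    guard = max(period_a, period_b) // 2 + 1
    state, low_since, out = "RUN_A", None, []
    prev_a = prev_b = 0
    for t in range(t_end):
        a, b = clock(t, period_a), clock(t, period_b)
        if state == "RUN_A":
            # Stop only on a falling edge of the running clock.
            if t >= switch_time and prev_a == 1 and a == 0:
                state, low_since = "HOLD_LOW", t
        elif state == "HOLD_LOW":
            # Restart only on a rising edge of the new clock, after the guard.
            if t - low_since >= guard and prev_b == 0 and b == 1:
                state = "RUN_B"
        out.append(a if state == "RUN_A" else b if state == "RUN_B" else 0)
        prev_a, prev_b = a, b
    return out


def pulse_widths(samples):
    """Run lengths of equal samples, dropping the first and last (cut-off) runs."""
    runs, count = [], 1
    for prev, cur in zip(samples, samples[1:]):
        if cur == prev:
            count += 1
        else:
            runs.append(count)
            count = 1
    runs.append(count)
    return runs[1:-1]
```

With period_a=10, period_b=14 and a switch request at tick 23, this model holds the output low from tick 25 to tick 41 and restarts high at tick 42; no high or low pulse is shorter than half the faster clock's period.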

Module to generate a clean reset release after a crystal oscillator start-up. Interestingly, all Verilog example code assumes a clean clock and a clean reset. Unfortunately reality is a bit different. It takes an often unknown period before a crystal runs stably, and in the meantime you can get all kinds of runt pulses from it. The same holds more or less for the reset: an external RC is converted to a hopefully clean rising edge using a Schmitt trigger gate. This module can deal with runt pulses and jittery resets. It waits until N pulses have been seen, after which it generates a clean reset at the output.
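The counting idea can be illustrated with a cycle-level Python sketch. It is an assumption-laden model, not the module's actual logic: the name clean_reset, the active-low convention, and the choice to restart the count whenever the raw reset dips are all hypothetical, and real runt pulses are an analog effect that a model like this cannot represent.

```python
def clean_reset(raw_reset_n, n_pulses):
    """Model of a delayed reset release.

    raw_reset_n: one sample per observed clock pulse (0 = in reset, 1 = released).
    Returns the cleaned reset_n per pulse: 0 until n_pulses consecutive
    released samples have been seen, 1 afterwards.
    """
    count, out = 0, []
    for sample in raw_reset_n:
        # Assumption: any dip of the raw reset restarts the count.
        count = min(count + 1, n_pulses) if sample else 0
        out.append(1 if count >= n_pulses else 0)
    return out
```

For example, with n_pulses=4 a raw reset that bounces twice before settling is only passed on after four consecutive clean samples.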

Streaming FFT:
Alpha versions now available!!
Streaming FFT with parameterized data width.
The standard module is a radix-2 FFT, using the Cooley-Tukey algorithm with decimation in time. The FFT comes with support for integration into your design.
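For readers who want a software reference to compare results against, here is a minimal floating-point sketch of a radix-2 decimation-in-time Cooley-Tukey FFT in Python. It is not the Fen Logic implementation: the function name is made up, and it ignores the fixed-point width, scaling and rounding of the hardware, so results will only match a hardware core up to those effects.

```python
import cmath


def fft_radix2_dit(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft_radix2_dit(x[0::2])  # transform of even-indexed samples
    odd = fft_radix2_dit(x[1::2])   # transform of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        # Butterfly: combine the two half-size transforms with a twiddle factor.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out
```

A unit impulse is a quick sanity check: its transform has the same value in every output bin.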
This is NOT a free module! For information and purchase, contact Fen Logic Ltd. On request you can send input streams and we will send the results back, so you can check the validity of the module before purchase.
Synthesized for a Xilinx XC7Z030, the design runs at ~150 MHz with 20-bit real and 20-bit imaginary data.
A streaming FFT trades latency for area and for the ease with which stages can be added or removed. The following table lists the estimated latency in clock cycles for the various FFT blocks and FFT sizes. Two models with lower latency are under development.
Calc. Output
16 21 31 11 63
32 37 63 19 119
64 69 127 35 231 (low-latency version: 114)
128 133 255 67 455
256 261 511 131 903
512 517 1023 259 1799
1024 1029 2047 515 3591
2048 2053 4095 1027 7175
4096 4101 8191 2051 14343
8192 8197 16383 4099 28679
16384 16389 32767 8195 57351
32768 32773 65535 16387 114695
65536 65541 131071 32771 229383
First results from the low-latency version are in. A 64-point FFT has a delay of about 114 clock cycles (versus ~230 for the normal version). Memory requirements for the low-latency version have not been estimated yet.
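The figures in the latency table for the normal version happen to follow simple closed forms, which makes it easy to estimate sizes that are not listed. The Python sketch below is an observation about the tabulated numbers, not a statement from Fen Logic about how the core is built; the function name is hypothetical, and the three block latencies are returned by column position because the column labels are not all known.

```python
def estimated_latency(points):
    """Latency in clock cycles for an FFT of `points` points (power of two).

    Returns the three per-block figures in table column order, then the total.
    The formulas are fitted to the published table, not taken from the design.
    """
    first = points + 5         # second table column
    second = 2 * points - 1    # third table column
    third = points // 2 + 3    # fourth table column
    return first, second, third, first + second + third
```

The total works out to 3.5 * points + 7 clock cycles, so the latency grows linearly with the FFT size.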

Here is a table with memory requirements. These numbers have to be multiplied by the width of the data in bits. For example, a 64-point FFT with 20-bit wide numbers requires 20*256 = 5120 bits of single-ported memory for the input re-order buffer.
These figures can change depending on special circumstances: an FFT that takes only real data at its input requires only half the input re-order memory.
Mem* SP
Mem* DP
Mem* DP
16 64 16 16 8
32 128 32 32 16
64 256 64 64 32
128 512 128 128 64
256 1024 256 256 128
512 2048 512 512 256
1024 4096 1024 1024 512
2048 8192 2048 2048 1024
4096 16384 4096 4096 2048
8192 32768 8192 8192 4096
16384 65536 16384 16384 8192
32768 131072 32768 32768 16384
65536 262144 65536 65536 32768
* Multiply by the word width in bits. (A complex number counts as TWO words.)
SP means single ported, DP means dual ported. These memories consist of multiple blocks, so extra space for power routing and control logic will be required.
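The worked example for the input re-order buffer can be written as a small Python helper. It is only a sketch: the function and parameter names are made up, and the factor of 4 words per point is read off the single-ported column of the table above rather than taken from any design documentation.

```python
def input_reorder_bits(points, word_width_bits, real_input_only=False):
    """Bits of single-ported memory for the input re-order buffer.

    Assumes 4 words per FFT point, as in the SP column of the table,
    halved when the FFT takes only real input data.
    """
    words = 4 * points
    if real_input_only:
        words //= 2
    return words * word_width_bits
```

For a 64-point FFT with 20-bit words this gives 5120 bits, matching the example in the text, and 2560 bits for the real-input case.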

Most of these modules and a lot more can be downloaded for free from www.verilog.pro. The free modules come with no warranty, as each will tell you in its header. In contrast to other free Verilog code, these modules are complete freeware: no obligation to open up any other part of your code comes with them. They also come with their own test benches.
Other Verilog modules you can find there are:
As with the software, if you want extra features I am willing to listen, but my time is limited so it may take a while. Again, throwing money in my direction might help ☺.