Gr0estl Algo on AtomMiner platform

We’re one step closer to Monero after prototyping the Gr0estl algorithm on the AtomMiner platform. The idea was to build a working alpha prototype before we start making an actual fast Gr0estl core. We took GRS coin sources from a couple of open-source projects, such as cpuminer and its forks, plus the OpenCL kernel from cgminer, and fed our version of the Gr0estl code into Vivado HLS to synthesize FPGA-ready code.

The project turned out to work in the FPGA right away, giving us approximately 120-180 kH/s. That is comparable to one mining thread on my Core i5 M520 CPU, the difference being power consumption: the AtomMiner board currently draws only 1.5 W at the wall while mining GRS coin.

I can already see room for optimizations and speed-ups in the existing code before a high-throughput core is released and ready to use. On the other hand, looking at the implementation stats, I think it is worth trying to fit a couple of hashing cores into the FPGA chip to get a better hashrate.

Technical Details

You can find the project sources in our GitHub repository. The core performs the double Gr0estl hash calculation required by GRS coin and runs a full loop over the nonce from 0 to 0xFFFFFFFF.

As many of you might know, GRS is what is called a full-header coin, which means no midstate or other pre-calculation can be done for this algo. Here is the main module’s function as declared in the code:

void groestl(uint64_t data0, uint64_t data1, uint64_t data2, uint64_t data3, uint64_t data4,
	uint64_t data5, uint64_t data6, uint64_t data7, uint64_t data8, uint64_t data9, uint32_t target7, uint32_t target6, uint32_t startnonce,
	ap_int<32> *result, ap_int<1> *ticket)

The full header consists of 80 bytes of data, including the nonce. To avoid dealing with memory implementation and addressing, we took the easiest route for passing header data from Verilog to C: ten 8-byte numbers, followed by the most significant DWORDs of the target, the start nonce and the result signals.
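To make the "ten 8-byte numbers" idea concrete, here is a minimal host-side sketch of splitting an 80-byte header into ten 64-bit words. The helper name pack_header and the little-endian byte order inside each word are assumptions made for illustration; the real core may expect a different order, so check it against the project sources.

```c
#include <stdint.h>
#include <stddef.h>

#define HEADER_BYTES 80
#define HEADER_WORDS 10

/* Hypothetical helper: pack an 80-byte block header into ten 64-bit words.
 * Assumption: each word is assembled little-endian, so the first byte of
 * every 8-byte group becomes the least significant byte of the word. */
void pack_header(const uint8_t header[HEADER_BYTES],
                 uint64_t words[HEADER_WORDS])
{
    for (size_t w = 0; w < HEADER_WORDS; ++w) {
        uint64_t v = 0;
        for (size_t b = 0; b < 8; ++b) {
            v |= (uint64_t)header[w * 8 + b] << (8 * b);
        }
        words[w] = v;
    }
}
```

The ten resulting words would then map onto data0 through data9, with the nonce sitting in the last four bytes of the header.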

The logic inside the groestl function is as follows: it loops the nonce from the startnonce value all the way to 0xFFFFFFFF. For each nonce it calculates the double Gr0estl hash of the input message, compares the significant target values with the hash values, and fires the ticket signal if the current hash satisfies the POW requested via the target fields. Once the ticket fires, you can pass the result back to the miner software or send it to the pool server in any way you like: that is your successful share, and it should be accepted by the pool server.
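The control flow of that loop can be sketched in plain C for testing on a PC. Note the assumptions: toy_hash is a placeholder mixer and is NOT Gr0estl, scan_nonces is a hypothetical name, the nonce is passed separately instead of being patched into the header, and the "hash at or below target" comparison on the two top DWORDs is our guess at what the core checks.

```c
#include <stdint.h>

/* Placeholder mixing function, NOT Gr0estl. It only stands in for the
 * double hash so the scan loop can be exercised. It returns two dwords
 * that play the role of hash dword7 and dword6. */
static void toy_hash(const uint64_t data[10], uint32_t nonce,
                     uint32_t *out7, uint32_t *out6)
{
    uint64_t x = 0x9E3779B97F4A7C15ULL ^ nonce;
    for (int i = 0; i < 10; ++i) {
        x ^= data[i];
        x *= 0xBF58476D1CE4E5B9ULL;
        x ^= x >> 29;
    }
    *out7 = (uint32_t)(x >> 32);
    *out6 = (uint32_t)x;
}

/* Scan nonces from startnonce up to and including 0xFFFFFFFF.
 * Returns 1 and stores the winning nonce in *result (the "ticket")
 * when the hash is at or below the target; returns 0 when the whole
 * range is exhausted without a hit. */
int scan_nonces(const uint64_t data[10], uint32_t target7, uint32_t target6,
                uint32_t startnonce, uint32_t *result)
{
    uint32_t nonce = startnonce;
    for (;;) {
        uint32_t h7, h6;
        toy_hash(data, nonce, &h7, &h6);
        if (h7 < target7 || (h7 == target7 && h6 <= target6)) {
            *result = nonce;
            return 1;
        }
        if (nonce == 0xFFFFFFFFu) {
            return 0; /* stop before the 32-bit counter wraps around */
        }
        ++nonce;
    }
}
```

The explicit check against 0xFFFFFFFF before the increment matters: a plain `nonce <= 0xFFFFFFFF` loop condition on a 32-bit counter would never terminate.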

Below is the Verilog binding of the groestl module generated by Vivado HLS. It is as simple as it sounds: just hook it up, give it some data to chew on, and sit back waiting for the result.

groestl groestl(
           .ap_clk(clk_h),
           .ap_rst(reset_break),
           .ap_start(start_hash),
           .ap_done(),
           .ap_idle(),
           .ap_ready(),
           .data0({ws0[31:0], ws0[63:32]}), // data dword0
           .data1({ws1[31:0], ws1[63:32]}),
           .data2({ws2[31:0], ws2[63:32]}),
           .data3({ws3[31:0], ws3[63:32]}),
           .data4({ws4[31:0], ws4[63:32]}),
           .data5({ws5[31:0], ws5[63:32]}),
           .data6({ws6[31:0], ws6[63:32]}),
           .data7({ws7[31:0], ws7[63:32]}),
           .data8({ws8[31:0], ws8[63:32]}),
           .data9({ws9[31:0], ws9[63:32]}),
           .target7(target7), // dword7 from target
           .target6(target6), // dword6 from target
           .startnonce(32'h00000000),
           .ticket_V_ap_vld(hashFound),
           .result_V(nonce)
   );
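Each data port in the binding is fed with {wsN[31:0], wsN[63:32]}, that is, the 64-bit word with its two 32-bit halves swapped. A small C model of that concatenation can help when preparing test vectors on the host; the function name swap_dwords is ours and is only for illustration.

```c
#include <stdint.h>

/* C model of the Verilog concatenation {ws[31:0], ws[63:32]}:
 * the low dword moves to the upper half and the high dword
 * moves to the lower half. */
uint64_t swap_dwords(uint64_t ws)
{
    return (ws << 32) | (ws >> 32);
}
```

For example, swap_dwords(0x1122334455667788) gives 0x5566778811223344, and applying it twice returns the original word.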

In theory I would want to connect the ap_done signal as well, to know when the module has finished chewing through all available nonces and is ready for new data. In the real world our computation speed is too low, so we will most likely get another job from the pool before we finish the whole range of available nonces.
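A quick back-of-the-envelope check shows why the full range is rarely finished. The helper below (a hypothetical name, for illustration only) divides the 2^32 possible nonces by the hashrate. At the 120-180 kH/s quoted earlier that works out to roughly 35,800 down to 23,900 seconds, or about 6.6 to 9.9 hours per job.

```c
/* Seconds needed to try every 32-bit nonce at a given hashrate,
 * expressed in hashes per second. */
double full_range_seconds(double hashes_per_second)
{
    const double nonce_count = 4294967296.0; /* 2^32 nonces */
    return nonce_count / hashes_per_second;
}
```

Pools typically hand out new work far more often than that, so leaving ap_done unconnected is a reasonable shortcut for a prototype at this speed.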

Please feel free to donate if you like our project! BTC: 3LwsJAzPd8weD1FypVWmkDFMwA7rgjPSif