Ramhash bug #378

Open
opened 3 months ago by duke · 2 comments
duke commented 3 months ago
Owner

TLDR: Ramhash code on the ramhash branch coredumps when the mining thread starts and it's not clear why.

The command

./src/hushd -debug=ramhash -ac_algo=ramhash -ac_name=RAMHASH -ac_private=1 -ac_blocktime=20 -ac_reward=500000000 -ac_supply=55555 -gen=1 -genproclimit=1 -testnode=1

coredumps immediately after startup, when it attempts to execute RamhashMiner from src/miner.cpp. If -gen=1 -genproclimit=1 are omitted, the node starts up fine but coredumps as soon as hush-cli -ac_name=RAMHASH setgenerate true is executed.

Output from

gdb src/hushd core

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x000055c9483f01c0 in RamhashMiner (pwallet=0x55c94abb62f0) at miner.cpp:1504
1504    {
[Current thread is 1 (Thread 0x7f26e1ffb700 (LWP 2722869))]
(gdb) bt
#0  0x000055c9483f01c0 in RamhashMiner (pwallet=0x55c94abb62f0) at miner.cpp:1504
#1  0x000055c94888a02b in thread_proxy ()
#2  0x00007f2717a2f609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#3  0x00007f2717954293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(gdb) info registers
rax            0x55c9483f01a0      94322988876192
rbx            0x0                 0
rcx            0x3                 3
rdx            0x1                 1
rsi            0x55c94ad793a0      94323032429472
rdi            0x55c94abb62f0      94323030582000
rbp            0x7f26e1ffadb0      0x7f26e1ffadb0
rsp            0x7f26e17fbd88      0x7f26e17fbd88
r8             0x55c94abb62f0      94323030582000
r9             0x7f26e1ffb700      139804977116928
r10            0x7f26e1ffb9d0      139804977117648
r11            0x7f26a1ffad88      139803903372680
r12            0x55c94ad794f0      94323032429808
r13            0x55c94ad793a0      94323032429472
r14            0x7ffff1142360      140737238016864
r15            0x7f26e1ffaec0      139804977114816
rip            0x55c9483f01c0      0x55c9483f01c0 <RamhashMiner(CWallet*)+32>
eflags         0x10206             [ PF IF RF ]
cs             0x33                51
ss             0x2b                43
ds             0x0                 0
es             0x0                 0
fs             0x0                 0
gs             0x0                 0
(gdb) info frame
Stack level 0, frame at 0x7f26e1ffadc0:
 rip = 0x55c9483f01c0 in RamhashMiner (miner.cpp:1504); saved rip = 0x55c94888a02b
 called by frame at 0x7f26e1ffae00
 source language c++.
 Arglist at 0x7f26e17fbd80, args: pwallet=0x55c94abb62f0
 Locals at 0x7f26e17fbd80, Previous frame's sp is 0x7f26e1ffadc0
 Saved registers:
  rbp at 0x7f26e1ffadb0, rip at 0x7f26e1ffadb8
(gdb) info locals
chainparams = <optimized out>
reservekey = <error reading variable reservekey (Cannot access memory at address 0x7f26a1ffab60)>
nExtraNonce = <error reading variable nExtraNonce (Cannot access memory at address 0x7f26a1ffa9fc)>
script = <optimized out>
total = <optimized out>
i = <optimized out>
j = <optimized out>
m_cs = <error reading variable m_cs (Cannot access memory at address 0x7f26a1ffaaf0)>
cancelSolver = <error reading variable cancelSolver (Cannot access memory at address 0x7f26a1ffa9fb)>
c = <error reading variable c (Cannot access memory at address 0x7f26a1ffaa10)>
ramhash = <error reading variable ramhash (value of type `LXRHash' requires 1073741832 bytes, which is more than max-value-size)>
__func__ = "RamhashMiner"
(gdb) p $sp
$1 = (void *) 0x7f26e17fbd88
(gdb) x $sp
0x7f26e17fbd88: Cannot access memory at address 0x7f26e17fbd88

Notice that the coredump occurs on a line containing only {, which is a hint: the crash happens in the function prologue, before any statement in the body of RamhashMiner runs. I think the bug is related to linking against an .o file as a library. I believe the error is in the build system, which currently passes src/Ramhash/obj/lxrhash.o to ld as a static library when it should be passed to g++ as an object file. The weird thing is that it compiles and links without error.

duke added the bug label 3 months ago
Poster
Owner

Currently in our build system every "library" we link against is a ".a" file while lxrhash.o is an ".o" file. As a test I generated a ".a" file via

ar rcs liblxrhash.a lxrhash.o

and changed the build system to link against that instead:

--- a/src/Makefile.am
+++ b/src/Makefile.am
@@ -62,7 +62,7 @@ LIBUNIVALUE=univalue/libunivalue.la
 LIBZCASH=libzcash.a
 LIBHUSH=libhush.a
 LIBRANDOMX=RandomX/build/librandomx.a
-LIBRAMHASH=Ramhash/obj/lxrhash.o
+LIBRAMHASH=Ramhash/obj/liblxrhash.a

Even with those changes, I still get exactly the same coredump+backtrace with an invalid stack pointer:

#0  0x00005644459ae750 in RamhashMiner (pwallet=0x564448e139e0) at miner.cpp:1504
1504    {
[Current thread is 1 (Thread 0x7fe6aaffd700 (LWP 2740119))]
(gdb) bt
#0  0x00005644459ae750 in RamhashMiner (pwallet=0x564448e139e0) at miner.cpp:1504
#1  0x0000564445e4902b in thread_proxy ()
#2  0x00007fe6db2b9609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#3  0x00007fe6db1de293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(gdb) x $sp
0x7fe6aa7fdd88: Cannot access memory at address 0x7fe6aa7fdd88

Why is the stack pointer invalid?

Poster
Owner

Instead of linking a .o or .a file, I added src/Ramhash/src/lxrhash.cpp directly as a source file to our build system, so that the build compiles it with g++ into ./src/Ramhash/src/libbitcoin_common_a-lxrhash.o instead of handing a prebuilt file to ld.

Still get exactly the same coredump + backtrace.
