Support multiple lite servers #1

Open
opened 1 year ago by duke · 15 comments
duke commented 1 year ago
Owner

Currently the code only supports a single hardcoded lightwalletd URL, lite2.hushpool.is. We want the same feature that SDL has: it randomly chooses a server from a list and, if that server happens to be down, tries other servers.

To implement this, SDL chooses a random index into the array of servers, and then if that server is down, increments the index by one, modulo the size of the server list. It repeats this process up to N times, where N is the size of the server list. This ensures that every server in the list is tried once. If it instead chose a random index every time, some servers might not be tried.

This feature can be broken down into 2 parts:

  • [x] Randomly choose an initial server (easier)
  • [ ] Randomly choose a different server if initial server is down (harder)

The first part offers load balancing, while the second part offers resilience to down servers. With both implemented, the app will continue to work as long as a single lite server is functional.

fekt commented 1 year ago
Collaborator

Do we have a list of lightwalletd servers that have been updated and can be used for this? I just need a list, and I think I can pick one randomly here instead of using DEFAULT_SERVER_URL:
https://git.hush.is/hush/SilentDragonAndroid/src/branch/main/app/src/main/java/cash/z/ecc/android/ext/Const.kt#L46

duke commented 1 year ago
Poster
Owner

@fekt

wtfistheinternet.hush.is
lite.hush.is (same as lite.hush.land)
lite2.hush.is
poop.granitephone.me
lite.hushpool.is
lite2.hushpool.is

fekt commented 1 year ago
Collaborator

Thanks. Bigger list the better. The code I added seems to work and I'm getting different servers showing under the settings when closing the app and re-opening. There is likely some funkiness if manually changing or resetting to default that will need to be looked into. Manually changing the server will save to encrypted prefs and always use that server. I have not checked what resetting does exactly.

fekt commented 1 year ago
Collaborator

~~poop.granitephone.me seems down currently and app crashed when it tried to use it.~~

Looks like it should be poop.granitefone.me. autocorrect prob got ya lol.
lite2.hush.is did not work.

duke commented 1 year ago
Poster
Owner

@fekt I mistyped, yeah it's poop.granitefone.me. Looks like lite2.hush.is resolves but doesn't have an active server.

duke commented 1 year ago
Poster
Owner

@fekt the code to randomly choose a server and keep trying new servers until it finds one that is up is here in SDL: https://git.hush.is/hush/SilentDragonLite/src/branch/master/src/settings.cpp#L300

We are going to need code like that in SDA because lite servers go down and up all the time and currently 1.0.1 SDA shows a popup error if a server is down, then crashes.

duke commented 1 year ago
Poster
Owner

@fekt to explain how the code works: It creates a "tries" variable which is the number of entries in the servers array. This will be the maximum number of tries to find a server that is up. It then generates a random integer from 0 up to (but not including) the length of the array, to get a random index into the array. It then tries to connect. If the server is not up, it increments the index by 1 (wrapping around to 0 at the end of the array) and tries again. It continues that process until it finds a server that is up or it runs out of tries.

The above algorithm is better than just picking a random integer in a loop because sometimes you will pick the same random index (we have a small array so it's common) and you won't know when to stop the loop.
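A minimal Kotlin sketch of the algorithm described above might look like the following. It is not the SDL code: `pickServer` and the `isUp` callback are hypothetical names used only for illustration, and the actual connection check is left to the caller.

```kotlin
import kotlin.random.Random

// Hypothetical helper, not part of SDL, SDA, or the SDK.
// isUp is whatever reachability check the caller wants to plug in.
fun pickServer(
    servers: List<String>,
    random: Random = Random.Default,
    isUp: (String) -> Boolean
): String? {
    if (servers.isEmpty()) return null
    var tries = servers.size                 // at most one attempt per server
    var index = random.nextInt(servers.size) // random start in 0 until size
    while (tries > 0) {
        val candidate = servers[index]
        if (isUp(candidate)) return candidate
        index = (index + 1) % servers.size   // next server, wrapping to 0
        tries--
    }
    return null // every server was tried once and none responded
}
```

Because the index only moves forward from the random start, no server is tried twice and the loop always ends after `servers.size` attempts.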

fekt commented 1 year ago
Collaborator

I am not sure on the best way to implement trying another server on failure yet. From what I can see debugging, it will try 6 times before failing and then throw a generic "Unrecoverable Error". The lower level calls for things like getlightdinfo are all in the SDK. I would probably need to try and catch specific error messages, but I've seen two different errors for bad servers so far.

duke commented 1 year ago
Poster
Owner

@fekt it sounds like we might need to modify our SDK to implement this. Can you point me to the code in SDA + the SDK that does this stuff? Maybe I can help

fekt commented 1 year ago
Collaborator

@duke I don't really know or understand a lot of the code yet. Modifying in the SDK would be one way to do it instead of having the app initialize the connection but it'd probably be kind of a hack and maybe not ideal for an SDK. I think on startup it tries calling this initially:
https://git.hush.is/fekt/hush-android-wallet-sdk/src/branch/main/sdk-lib/src/main/java/cash/z/ecc/android/sdk/internal/block/CompactBlockDownloader.kt#L78

In the app, you'll only get this error handling if it fails after 6 tries. I added an additional dialog here to catch "UNAVAILABLE":
https://git.hush.is/fekt/hush-android-wallet/src/commit/c9ac5239350213508b452ce2e95006cac887e726/app/src/main/java/cash/z/ecc/android/ext/Dialogs.kt#L94

It also didn't seem possible to restart the synchronizer in the app after it's already been started. It gives an error stating so, but maybe it needs to be destroyed first. I am not sure any changes to the network or server being used after startup are possible without unknown modifications. I am not certain, but I don't think changing the server in settings actually uses that server until restarting the app, but I would need to test. I wasn't seeing anything change the server when saving anyway, but may have overlooked something.

I did see this in the SDK for testing server changes, but it's only a test:
https://git.hush.is/fekt/hush-android-wallet-sdk/src/branch/main/sdk-lib/src/androidTest/java/cash/z/ecc/android/sdk/integration/service/ChangeServiceTest.kt

I think it could possibly be done in the app, but checking server reachability would likely need to be done before the synchronizer is started. Some of this code is hidden in functions that call other functions, so you have to follow several calls to see what's actually going on. showCricticalMessage is actually called here within onCriticalError:
https://git.hush.is/fekt/hush-android-wallet/src/branch/main/app/src/main/java/cash/z/ecc/android/ui/MainActivity.kt#L260

And that is called here:
https://git.hush.is/fekt/hush-android-wallet/src/branch/main/app/src/main/java/cash/z/ecc/android/ui/MainActivity.kt#L229

duke commented 1 year ago
Poster
Owner

@fekt thanks for explaining with those links, it helps me understand how the SDK+app coordinate.

The file https://git.hush.is/fekt/hush-android-wallet-sdk/src/branch/main/sdk-lib/src/androidTest/java/cash/z/ecc/android/sdk/integration/service/ChangeServiceTest.kt is very interesting

It seems we need a function in the SDK that calls something like LightWalletGrpcService.new(context, LightWalletEndpoint("invalid.lightwalletd", 9087, true)) in a loop, catching exceptions from down servers, until it gets a valid server, then starts the synchronizer/downloader on that server

"I think it could possibly be done in the app, but checking server reachability would likely need to be done before the synchronizer is started. " Yes, I agree. In SDL, the C++ code getRandomServer() is basically a wrapper around Rust code that does the low-level server connection stuff. With SDA, it will be something similar, possibly a function like getRandomServer() in Kotlin that calls SDK code like LightWalletGrpcService.new(), and we may need to make some things public in the SDK that are currently private.

As for "I am not certain, but I don't think changing the server in settings actually uses that server until restarting the app but would need to test", I think we might want to consider that a bug, because most users will not know to restart the app. There should be a way to make a new grpc connection when the users hit the "Save" option in the GUI.

I agree with your statement "Modifying in the SDK would be one way to do it instead of having the app initialize the connection but it'd probably be kind of a hack and maybe not ideal for an SDK." and my idea is to modify the SDK as little as possible such that the app can have code that does what we want. This will likely mean making some currently private SDK functions/methods public and callable via the app. It seems that some functionality we need to implement what we want is buried in SDK internals and not available to the app.
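As a rough sketch of the "loop, catching exceptions from down servers" idea, assuming the app can be handed some connect call that throws when a server is down: `connectToFirstAvailable` and its `connect` parameter are hypothetical, and `connect` only stands in for an SDK call such as LightWalletGrpcService.new(), whose exact behavior on failure is not confirmed here.

```kotlin
// Hypothetical helper: `connect` stands in for an SDK call and is
// expected to throw when the given server is down.
fun <T : Any> connectToFirstAvailable(
    hosts: List<String>,
    connect: (String) -> T
): Pair<String, T>? =
    hosts.firstNotNullOfOrNull { host ->
        // A failed attempt becomes null, so the search moves on to the next host.
        runCatching { host to connect(host) }.getOrNull()
    }
```

The function returns the host together with whatever `connect` produced, so the caller could start the synchronizer on that server, or show an error if the result is null. The order of `hosts` decides which server is tried first, so a caller that wants a random start would pass the list in that order.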

fekt commented 1 year ago
Collaborator

@duke I tested and confirmed that changing the server in settings does not actually use that server until a restart. Monitoring on the original server, I see all requests still go there. Zcash took that feature out of Nighthawk. This is the issue and PR from when they added it:
https://github.com/zcash/zcash-android-wallet/issues/182
https://github.com/zcash/zcash-android-wallet/pull/205

I could try looking at and testing Nighthawk to see if that does the same. They mentioned Nighthawk restarting main activity, which would probably cause the server to be used. It may have been an old version of Nighthawk though because last I looked at it they updated everything to mostly use the same code we are.

I was thinking of using a function like this to check availability in the app before the initializer or synchronizer starts but haven't really looked into where exactly it would need to be implemented. It could be modified to keep changing/checking a server until it finds a working one and then use that server when the SDK is initialized.

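import java.io.IOException
import java.net.InetAddress
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.withContext

// Returns true if the host answered within the timeout, false otherwise.
private suspend fun checkServer(): Boolean =
    withContext(Dispatchers.IO) {
        try {
            // getByName() expects a bare hostname, not a URL with a scheme
            val server: InetAddress = InetAddress.getByName("lite.hush.is")
            val timeout = 3000 // milliseconds
            // isReachable() uses ICMP echo or TCP port 7, so a firewalled
            // host can look unavailable even when lightwalletd is running
            server.isReachable(timeout)
        } catch (e: IOException) {
            false // DNS lookup or the probe itself failed
        }
    }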

I think it could potentially be done here when loading the config. We would already have the initial random server, so we could check its availability, find a server that's up, and then use that for the config. Might be easier said than done though.
https://git.hush.is/fekt/hush-android-wallet/src/branch/main/app/src/main/java/cash/z/ecc/android/ui/setup/WalletSetupViewModel.kt#L93
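One possible variation on the availability check, sketched here as an assumption rather than a tested fix, is to open a TCP connection to the port the server actually listens on. `canConnect` is a hypothetical helper, and the port is left as a parameter because this thread does not state which port the lite servers use.

```kotlin
import java.io.IOException
import java.net.InetSocketAddress
import java.net.Socket

// Hypothetical helper: returns true if a TCP connection to host:port opens in time.
// Call it off the main thread (for example inside withContext(Dispatchers.IO)).
fun canConnect(host: String, port: Int, timeoutMs: Int = 3000): Boolean =
    try {
        Socket().use { socket ->
            socket.connect(InetSocketAddress(host, port), timeoutMs)
            true
        }
    } catch (e: IOException) {
        false // refused, timed out, or the name did not resolve
    }
```

A successful connect only shows that something accepts connections on that port. It does not prove lightwalletd is answering requests, so a server that resolves but has no active service behind it could still need a real SDK call to rule out.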

Collaborator

> @fekt
>
> wtfistheinternet.hush.is
> lite.hush.is (same as lite.hush.land)
> lite2.hush.is
> poop.granitephone.me
> lite.hushpool.is
> lite2.hushpool.is

@fekt This is the current most updated list that I believe will be relevant in the future:

lite.hush.is
lite.hush.land
lite.hush.community
wtfistheinternet.hush.is
lite.myhush.org
poop.granitefone.me
lite.hushpool.is
lite2.hushpool.is

I took it from the recently updated mainwindow.cpp#L853 file: https://git.hush.is/hush/SilentDragonLite/src/branch/dev/src/mainwindow.cpp#L853

So lite2.hush.is was removed and two more servers were added: lite.myhush.org and lite.hush.community.

fekt commented 1 year ago
Collaborator

@onryo Is that list still accurate? I didn't see this until now but updated based on servers listed for latest SDL release the other day. It seemed to have problems connecting to lite.hush.communty and lite.myhush.org. I probably removed a couple as well, but will add back if so.

Collaborator

> @onryo Is that list still accurate? I didn't see this until now but updated based on servers listed for latest SDL release the other day. It seemed to have problems connecting to lite.hush.communty and lite.myhush.org. I probably removed a couple as well, but will add back if so.

Yes, it is the final list, all 8 servers are up and running the latest lightwalletd.
