Valhalla Legends Forums Archive | Battle.net Bot Development | Open BotNet Spec 1.0

St0rm.iD
I'd appreciate constructive comments on this protocol.

http://pointless.servemp3.com/botnetspec.html
April 30, 2003, 2:51 AM
Arta
1. How does client A know the addresses of clients B, C, D... etc, without a central server?

2. Even if you get round this problem, it's eew to have to connect each client to every other client. Imagine if there are 200 users online? or 2000? This means that each client has to send 200 (or 2000) identical messages instead of having to send one to the server. This is not practical. What happens if you have a crappy client as one node? Other nodes are then not going to have correct information about what's transpiring on the OBN.

3. How do you prevent abuse? This totally open network would have no reliable means to ban or restrict the access of abusive users.

4. Last, but not least, it's *very* exploitable. Imagine client A, a malicious user, adds an entry on client B. Client B then sends a 'register' message to all the other clients. Now imagine that client A floods client B with additions. If there are 200 users on the OBN, you've suddenly multiplied the amount of data traversing client B's connection by a factor of 200 (at least). DoS-in-a-box.
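
A rough back-of-the-envelope sketch of points 2 and 4, assuming every addition is relayed unchanged to each of the other clients. The function names and the 1,000-byte flood size are made up for illustration and are not part of the spec.

```python
def copies_per_message(other_clients):
    """Copies a client must send when it is connected to every other client."""
    return other_clients


def relayed_bytes(inbound_bytes, other_clients):
    """Bytes client B pushes out when it relays inbound additions to everyone else.

    Assumes each addition is forwarded once, unchanged, to each other client.
    """
    return inbound_bytes * other_clients


for other_clients in (200, 2000):
    print(other_clients,
          copies_per_message(other_clients),
          relayed_bytes(1_000, other_clients))
```

Under these assumptions the outbound load on client B grows linearly with the number of clients, which is the multiplication described in point 4.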
April 30, 2003, 4:07 AM
tA-Kane
You most certainly need a way to communicate the communication version, to ensure that new nodes are compatible with old, and vice versa.

How will you combat database corruption by hostile nodes?

I am not very fond of peer-to-peer networks such as this, because of how bandwidth intensive they are. The Gnutella network is a prime example: just 4 "idle" connections will leech the fuck out of my DSL line. Granted, Gnutella has thousands more clients (and a matching increase in messages transmitted), but the possibility still exists.

One way to cut down on traffic is to have "designated hubs". Receiving clients (not receiving hubs) transmit their whole database to the hub, and then transmit changes (adds, changes, deletes, etc.) to the hub when appropriate. When the designated hub receives a query, it searches the cached databases instead of forwarding the query on to the clients. If matches are found in the cache, it forwards the reply as if the reply had originated from the matched database's owner, and sends a "query/match found for your database" notice to that client, to let it know that a match was found on its database.
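
A minimal sketch of that hub idea, assuming each client database is a simple key/value mapping. The class and method names (`DesignatedHub`, `upload`, `apply_change`, `query`) are hypothetical; the spec defines none of them.

```python
class DesignatedHub:
    """Caches client databases and answers queries on the owners' behalf."""

    def __init__(self):
        self.cache = {}  # client name -> cached copy of that client's database

    def upload(self, client, database):
        """A client sends its whole database once."""
        self.cache[client] = dict(database)

    def apply_change(self, client, key, value=None):
        """A client sends an add/change (value given) or a delete (value is None)."""
        database = self.cache.setdefault(client, {})
        if value is None:
            database.pop(key, None)
        else:
            database[key] = value

    def query(self, key):
        """Search the cache; return replies attributed to owners, plus who to notify."""
        replies = []    # (owner, value) pairs, sent as if the owner had replied
        notified = []   # owners that get a "match found for your database" notice
        for owner, database in self.cache.items():
            if key in database:
                replies.append((owner, database[key]))
                notified.append(owner)
        return replies, notified


hub = DesignatedHub()
hub.upload("clientA", {"someuser": "safelisted"})
hub.apply_change("clientB", "someuser", "banned")
print(hub.query("someuser"))
```

The query never leaves the hub, so the only traffic toward the clients is the short match notice.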

[quote]2.1 Database exchange
This message is sent to a connecting node once. It is not routed.[/quote]What do you mean by "It is not routed."? Do you mean that the client receiving this exchange does not transmit it to any other clients connected to it? If so, then that's very bad... if two networks "merge", then both databases most surely need to be transmitted. Additionally, it would be wise, when merging networks, to have the highest-versioned "hubs" of those networks be the link, so that the opposing databases aren't limited by an older protocol.

Just my two cents on what you have, so far.
April 30, 2003, 4:09 AM
tA-Kane
[quote author=Arta[vL] link=board=17;threadid=1190;start=0#msg8826 date=1051675632]2. Even if you get round this problem, it's eew to have to connect each client to every other client.

4. Dos-in-a-box.[/quote]Designated hubs would cut down on those two factors.
April 30, 2003, 4:11 AM
Skywing
[quote author=tA-Kane link=board=17;threadid=1190;start=0#msg8828 date=1051675882]
[quote author=Arta[vL] link=board=17;threadid=1190;start=0#msg8826 date=1051675632]2. Even if you get round this problem, it's eew to have to connect each client to every other client.

4. Dos-in-a-box.[/quote]Designated hubs would cut down on those two factors.
[/quote]Then you defeat the whole idea of decentralization.
April 30, 2003, 5:06 PM
tA-Kane
[quote author=Skywing link=board=17;threadid=1190;start=0#msg8847 date=1051722416]
[quote author=tA-Kane link=board=17;threadid=1190;start=0#msg8828 date=1051675882]Designated hubs would cut down on those two factors.
[/quote]Then you defeat the whole idea of decentralization.
[/quote]Not really. If the open BotNet spec allows for other clients' IP address transmission (which the current open spec does not specify, but I assumed it would via a custom message), then it would not be hard to get a list of alternate designated hubs (or even connect to designated hubs through other users, if that user's client allows for such a thing) so you can remain connected to the network when it goes down.
April 30, 2003, 6:10 PM
St0rm.iD
I'm sorry I wasn't clear enough in the spec on how to prevent floods and such.

We'll assume there are more compliant clients than clients that try to disrupt the system. For each register message, a given public key can only be entered in the database once. Message throttling will also be implemented. Since most good clients will comply with this, they will refuse such messages.
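
A sketch of how a compliant client might enforce both rules, assuming a sliding-window throttle per peer. The class name, the window size and the per-window limit are illustrative guesses; the spec does not give values.

```python
class RegisterFilter:
    """Accepts each public key once and throttles register messages per peer."""

    def __init__(self, max_per_window=5, window_seconds=10.0):
        self.max_per_window = max_per_window
        self.window_seconds = window_seconds
        self.keys = {}     # public key -> account name
        self.recent = {}   # peer -> timestamps of recently handled messages

    def accept(self, peer, public_key, account, now):
        """Return True if the register message is stored, False if it is refused."""
        stamps = [t for t in self.recent.get(peer, [])
                  if now - t < self.window_seconds]
        self.recent[peer] = stamps
        if len(stamps) >= self.max_per_window:
            return False                  # throttled: too many messages from this peer
        stamps.append(now)
        if public_key in self.keys:
            return False                  # this key is already in the database
        self.keys[public_key] = account
        return True


f = RegisterFilter(max_per_window=2, window_seconds=10.0)
print(f.accept("peer1", "key1", "account1", now=0.0))
print(f.accept("peer1", "key1", "account2", now=1.0))
```

The caller passes in the current time, so the throttle can be exercised without waiting on a real clock.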

About finding IP addresses Arta, I was envisioning either a profile key or a special whisper command to get the IP addresses.

Arta: not every client needs to be connected to each other. The messages propagate themselves via others, so I can be connected to another node which is connected to two other nodes. I have one connection, and when I send the message, I will reach three other nodes.
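
The propagation described here can be sketched as a flood over the connection graph, assuming each node forwards a message to its neighbours and ignores messages it has already seen. The node names and the `flood` helper are made up for the example.

```python
from collections import deque


def flood(links, origin):
    """Return the set of nodes that end up receiving a message sent by origin."""
    seen = {origin}
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        for neighbour in links[node]:
            if neighbour not in seen:   # already-seen messages are not forwarded again
                seen.add(neighbour)
                queue.append(neighbour)
    return seen - {origin}


# "me" has a single connection, to n1, which is connected to n2 and n3.
links = {
    "me": ["n1"],
    "n1": ["me", "n2", "n3"],
    "n2": ["n1"],
    "n3": ["n1"],
}
reached = flood(links, "me")
print(sorted(reached))
```

With one outgoing connection the message still reaches all three other nodes, because n1 relays it.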

Connection hubs would be implemented if necessary, however I believe they may automatically evolve, i.e. the people in vL would all connect to [vL], and the fe people would connect to Fatal-Error. Connecting these two together would bring the network together.

Kane: thanks for pointing that out about database exchange. I didn't consider that. I see three possible solutions:

1) forward to everyone on the network (expensive)
2) don't allow two networks to be connected (perhaps make this version 0)
3) central database (then why would it be p2p?)

The problem of authentication database corruption is an interesting one. I was assuming the good clients would be able to weed it out, but perhaps a malicious client would be able to get high enough in the chain and spoil it. Perhaps some sort of digital signature is required, however, I cannot figure out how a central authority would be determined.

About bandwidth, I believe UDP and hubs could be used to limit it. Instead of broadcasting responses, replies could just be sent via UDP to the peer. Firewall and reliability issues would have to be addressed. Central hubs, as I discussed above, may or may not be implemented in software.

The biggest problem I see is with the authentication database. Any suggestions about how faking could be 100% eliminated?
April 30, 2003, 7:34 PM
St0rm.iD
Cuphead proposes: "Ok, suppose we have nodes A, B, and E. A and B are normal functioning nodes, E (the Evil hacker node) has a corrupted database. All are functioning independently of one another. Assume E requests a database merge with A. A saves its current database into a new file and merges E's database into its own. A's database is now corrupted with E's bad keys. Now B requests a connect/merge with A. A sees an entry that exists in its backup database that coincides with B's, but is modified in A's newly merged (with E) database. A restores its old database and merges that with B's, since chances are that B's database is intact."
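
One possible reading of Cuphead's proposal, as a sketch. It assumes the database maps account names to public keys and that a merge lets the incoming entries overwrite local ones. `KeyDatabase` and its fields are hypothetical names.

```python
class KeyDatabase:
    """Keeps a backup of the pre-merge state and rolls back a suspicious merge."""

    def __init__(self, entries):
        self.entries = dict(entries)   # account -> public key
        self.backup = None             # copy saved before the most recent merge

    def merge(self, other):
        # If the peer agrees with our backup on an entry that our current copy
        # has changed, assume the previous merge was bad and restore the backup.
        if self.backup is not None:
            for account, key in other.items():
                if (self.backup.get(account) == key
                        and self.entries.get(account) != key):
                    self.entries = dict(self.backup)
                    break
        self.backup = dict(self.entries)   # save the current state, then merge
        self.entries.update(other)


a = KeyDatabase({"user1": "KEY1", "user2": "KEY2"})
a.merge({"user1": "FORGED"})                 # E's corrupted database
a.merge({"user1": "KEY1", "user3": "KEY3"})  # B's intact database
print(a.entries)
```

As in the quoted scenario, this only helps if the next peer to merge is honest. If E merges twice in a row, the backup itself is overwritten with the corrupted state.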

To handle distribution of the public key database, perhaps we could build it like we build the information database (query/response)? This would allow propagation and users who have dead accounts wouldn't clog up the network. We'd simply use the public key that gets the most responses.
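
A sketch of "use the public key that gets the most responses". It counts one vote per responding peer, which is an added assumption: without it, a single node could answer many times and outvote everyone else.

```python
from collections import Counter


def most_reported_key(responses):
    """responses maps each responding peer to the public key it reported.

    Returns the key reported by the most peers, or None if nobody answered.
    """
    if not responses:
        return None
    counts = Counter(responses.values())
    return counts.most_common(1)[0][0]


print(most_reported_key({"peer1": "KEY1", "peer2": "KEY1", "peer3": "FORGED"}))
```

This still relies on the assumption stated earlier in the thread, that honest peers outnumber malicious ones.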

Please make suggestions! ;)
April 30, 2003, 8:04 PM
tA-Kane
Another problem I just thought about was what if two clients make a different database change at the same time?

1) Either the packet number would be the same, thus some clients would parse packet A and others parse packet B, causing a database desync

2) If the packet numbers aren't the same, then other clients would still have a possible database desync, because of one packet being transmitted one way, and the other being transmitted in the opposite direction. Take the following diagram...

A <-> B <-> C <-> D <-> E

Database change 1 (packet 1) gets sent from A
Database change 2 (packet 2) gets sent from E

Packet 1 would be transmitted to B, packet 2 gets to D. Then, B sends to C, and D sends to C. C receives either of them first... then forwards them both on. C's database is now the last one received, B's database is now E's, and D's database is now A's.
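
The desync can be reproduced with a small simulation. It assumes every hop takes the same time, so a change arrives at a node after as many steps as there are hops from its origin, and that each node simply keeps the last change it applied. The helper name is invented for the example.

```python
NODES = ["A", "B", "C", "D", "E"]   # the chain A <-> B <-> C <-> D <-> E


def final_values(changes):
    """changes is a list of (origin, value) pairs sent at the same moment.

    Each node applies the changes in arrival order (fewest hops first) and
    ends up holding whichever change it applied last.
    """
    result = {}
    for node in NODES:
        here = NODES.index(node)
        arrivals = sorted(changes, key=lambda c: abs(here - NODES.index(c[0])))
        result[node] = arrivals[-1][1]
    return result


state = final_values([("A", "change 1"), ("E", "change 2")])
for node in NODES:
    print(node, state[node])
```

B ends up with E's change and D ends up with A's, so the two halves of the chain disagree. C sees both at the same step, so its result depends only on how the tie happens to be broken.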

[quote]Messages are each assigned a unique routing ID.[/quote]Additionally, going back to message numbers ("Routing IDs"), this could be done with any combination of packet requests/replies... Using the same diagram as above, take the following example...

A sends a request to E. The request gets sent to B. B adds the message number to the "received message numbers list", and forwards on to C. While C receives the packet, E now transmits a request to B. E sends to D, C forwards to D. D, seeing E first, forwards E's request to C, and disregards what it received from C. C, receiving E's request with the same message number as A's request, disregards the packet.

Now both packets are disregarded. I suppose you could call this a packet number collision...
April 30, 2003, 8:10 PM
tA-Kane
[quote author=tA-Kane link=board=17;threadid=1190;start=0#msg8859 date=1051733420]Now both packets are disregarded. I suppose you could call this a packet number collision...[/quote]Now that I think about this, Routing IDs would be best created by using the source IP address and then a unique number. That way, a collision would only occur if the source client uses a number on one connection for one packet, then the same number on a different connection for a different packet, and then those two connections are on the same network.
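
A sketch of that Routing ID scheme. The `ip:number` string format is an assumption, not something the spec defines, and the addresses are documentation examples. Using one counter per client, shared by all of its connections, avoids the remaining collision case described above.

```python
import itertools


class RoutingIdSource:
    """Generates Routing IDs from the client's source IP and a local counter."""

    def __init__(self, source_ip):
        self.source_ip = source_ip
        # One counter for the whole client, not one per connection.
        self._counter = itertools.count(1)

    def next_id(self):
        return f"{self.source_ip}:{next(self._counter)}"


node_a = RoutingIdSource("192.0.2.1")
node_e = RoutingIdSource("192.0.2.5")
print(node_a.next_id(), node_e.next_id())
```

Two clients that both pick the number 1 still produce different IDs, because the IP prefix differs.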
April 30, 2003, 8:19 PM
Camel
[quote author=tA-Kane link=board=17;threadid=1190;start=0#msg8851 date=1051726228]
[quote author=Skywing link=board=17;threadid=1190;start=0#msg8847 date=1051722416]
[quote author=tA-Kane link=board=17;threadid=1190;start=0#msg8828 date=1051675882]Designated hubs would cut down on those two factors.
[/quote]Then you defeat the whole idea of decentralization.
[/quote]Not really. If the open BotNet spec allows for other clients' IP address transmission (which the current open spec does not specify, but I assumed it would via a custom message), then it would not be hard to get a list of alternate designated hubs (or even connect to designated hubs through other users, if that user's client allows for such a thing) so you can remain connected to the network when it goes down.
[/quote]
agreed. consider DNS: the internet is (basically) decentralized. one could set up a whole bunch of hubs that are preset in to the client, and if even one connection is made that user would be able to discover new hubs. it's great in theory, but so is communism.
April 30, 2003, 8:33 PM
Skywing
[quote author=Camel link=board=17;threadid=1190;start=0#msg8864 date=1051734805]
[quote author=tA-Kane link=board=17;threadid=1190;start=0#msg8851 date=1051726228]
[quote author=Skywing link=board=17;threadid=1190;start=0#msg8847 date=1051722416]
[quote author=tA-Kane link=board=17;threadid=1190;start=0#msg8828 date=1051675882]Designated hubs would cut down on those two factors.
[/quote]Then you defeat the whole idea of decentralization.
[/quote]Not really. If the open BotNet spec allows for other clients' IP address transmission (which the current open spec does not specify, but I assumed it would via a custom message), then it would not be hard to get a list of alternate designated hubs (or even connect to designated hubs through other users, if that user's client allows for such a thing) so you can remain connected to the network when it goes down.
[/quote]
agreed. consider DNS: the internet is (basically) decentralized. one could set up a whole bunch of hubs that are preset in to the client, and if even one connection is made that user would be able to discover new hubs. it's great in theory, but so is communism.
[/quote]
Err.. you do know what the DNS root servers are, don't you?
April 30, 2003, 10:10 PM
St0rm.iD
[quote author=tA-Kane link=board=17;threadid=1190;start=0#msg8859 date=1051733420]
Another problem I just thought about was what if two clients make a different database change at the same time?

1) Either the packet number would be the same, thus some clients would parse packet A and others parse packet B, causing a database desync

2) If the packet numbers aren't the same, then other clients would still have a possible database desync, because of one packet being transmitted one way, and the other being transmitted in the opposite direction. Take the following diagram...

A <-> B <-> C <-> D <-> E

Database change 1 (packet 1) gets sent from A
Database change 2 (packet 2) gets sent from E

Packet 1 would be transmitted to B, packet 2 gets to D. Then, B sends to C, and D sends to C. C receives either of them first... then forwards them both on. C's database is now the last one received, B's database is now E's, and D's database is now A's.

[quote]Messages are each assigned a unique routing ID.[/quote]Additionally, going back to message numbers ("Routing IDs"), this could be done with any combination of packet requests/replies... Using the same diagram as above, take the following example...

A sends a request to E. The request gets sent to B. B adds the message number to the "received message numbers list", and forwards on to C. While C receives the packet, E now transmits a request to B. E sends to D, C forwards to D. D, seeing E first, forwards E's request to C, and disregards what it received from C. C, receiving E's request with the same message number as A's request, disregards the packet.

Now both packets are disregarded. I suppose you could call this a packet number collision...
[/quote]

Well, I'm not sure if this is included in the spec, but replies should include a GMT timestamp, which should help with time collisions.
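
A sketch of how a GMT timestamp could settle conflicting changes, assuming a "latest timestamp wins" rule and reasonably synchronized clocks. Breaking exact ties by origin name is an extra assumption so that every node picks the same winner. The field names are illustrative.

```python
def apply_change(current, incoming):
    """Keep whichever change has the later (timestamp, origin) pair."""
    if current is None:
        return incoming
    if (incoming["gmt"], incoming["origin"]) > (current["gmt"], current["origin"]):
        return incoming
    return current


change_a = {"gmt": 1051733400, "origin": "A", "value": "change 1"}
change_e = {"gmt": 1051733405, "origin": "E", "value": "change 2"}

# Two nodes receive the same changes in opposite orders.
near_a = apply_change(apply_change(None, change_a), change_e)
near_e = apply_change(apply_change(None, change_e), change_a)
print(near_a["value"], near_e["value"])
```

Both nodes settle on the same change regardless of arrival order. A node with a badly wrong or deliberately forged clock can still win every conflict, so this does not help against hostile nodes.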

I don't know what you mean about routing IDs. They are unique (well, as unique as we can get) and shouldn't have a problem. They're only used so a packet doesn't get endlessly routed.
May 1, 2003, 12:13 AM
Arta
[quote]
Arta: not every client needs to be connected to each other. The messages propagate themselves via others, so I can be connected to another node which is connected to two other nodes. I have one connection, and when I send the message, I will reach three other nodes.
[/quote]

If Client A sends a message to client B, how does client B know if client C already has it? Say B forwards it to C, and then C unknowingly forwards it to D, but A & B have already sent the message to D? Even with some means to prevent clients from processing the same message twice, the messages are still *sent* - It would be very, very easy to create feedback loops in such a system. You could help prevent that by including some kind of TTL in each packet, but this introduces a new problem - that some clients might miss out on messages. Either way, you'll have a lot of redundant traffic floating around.
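
Both effects can be counted in a small simulation. It is a simplified model: a node forwards a new message to every neighbour, including the one it came from, and receivers silently drop duplicates. The topology and function name are invented for the example.

```python
def flood_with_ttl(links, origin, ttl):
    """Flood a message from origin; return (nodes reached, copies transmitted)."""
    seen = {origin}
    frontier = [origin]
    sent = 0
    while frontier and ttl > 0:
        next_frontier = []
        for node in frontier:
            for neighbour in links[node]:
                sent += 1                    # the copy is transmitted either way
                if neighbour not in seen:    # duplicates are dropped on receipt
                    seen.add(neighbour)
                    next_frontier.append(neighbour)
        frontier = next_frontier
        ttl -= 1
    return seen - {origin}, sent


links = {
    "A": ["B", "D"],
    "B": ["A", "C", "D"],
    "C": ["B", "D"],
    "D": ["A", "B", "C"],
}
print(flood_with_ttl(links, "A", ttl=3))
print(flood_with_ttl(links, "A", ttl=1))
```

With these links, a TTL of 3 reaches all three other nodes at the cost of 10 transmissions, while a TTL of 1 needs only 2 transmissions but never reaches C.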

These kinds of distributed systems only work with some kind of central control mechanism. This has clearly been shown and demonstrated many times. Just look at similar distributed systems on the net - Kazaa and the like. All of them have a central controlling node. The lack of central control simply makes it too difficult to keep track of what's going on.

A better system would be to designate one of the nodes as a Master node, and another as a backup node. If the master node was going down intentionally, it could broadcast a reconnect message. If it suddenly disappeared, clients could automatically connect to the backup node, which would also have noticed that the master server died, and could thus designate itself as the master node and set a new backup. Even this has its problems, though..

Clients setting themselves as masters when they're not. Who's the master's master? Who has the definitive say?
How does a new node know who the master is if the last master/backup they used have both gone?
What happens if a master goes down and some poor sod on dialup who's designated as the backup node suddenly gets ~200 clients connecting to it?

(that's just a sampling)
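
The master/backup failover described above can be sketched as follows, assuming all nodes are honest and agree on who is alive, which is exactly what the questions above call into doubt. `Network` and `node_down` are hypothetical names.

```python
class Network:
    """Tracks one master and one backup, promoting the backup when the master dies."""

    def __init__(self, nodes):
        self.alive = list(nodes)
        self.master = self.alive[0]
        self.backup = self.alive[1] if len(self.alive) > 1 else None

    def node_down(self, node):
        self.alive.remove(node)
        if node == self.master:
            # The backup notices the master is gone and promotes itself.
            self.master, self.backup = self.backup, None
        elif node == self.backup:
            self.backup = None
        if self.backup is None:
            # The master designates a new backup from the remaining nodes.
            spare = [n for n in self.alive if n != self.master]
            self.backup = spare[0] if spare else None


net = Network(["node1", "node2", "node3"])
net.node_down("node1")
print(net.master, net.backup)
```

The sketch picks the new backup arbitrarily (the first remaining node), so it does nothing about the dialup problem, and it gives up if the master and backup disappear at the same time.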

There are solutions to these problems too, but the whole idea is still fraught with difficulties. Ultimately, totally distributed systems like this only work when 2 (maybe more) conditions are met:

- All nodes are trustworthy
- There exists a separate means for nodes to find the other nodes, should they become isolated.

The second condition *might* be met here (Battle.net), but the first one *definitely* isn't, as has been demonstrated with the current Botnet on a number of occasions.
May 1, 2003, 2:04 AM
St0rm.iD
I think if *most* nodes are trustworthy, they won't participate with the malicious nodes and will filter their traffic.

Yes, D may receive the same message a few times, but a feedback loop won't occur because it won't process the same message twice.
May 1, 2003, 2:09 AM
Arta
[quote]
I think if *most* nodes are trustworthy, they won't participate with the malicious nodes and will filter their traffic.
[/quote]

That kind of assumption is the root cause of 99.99999999% of security problems in software. You can't assume *anything*.

[quote]
Yes, D may receive the same message a few times, but a feedback loop won't occur because it won't process the same message twice.
[/quote]

ok, that's a reasonable solution, but it still results in the generation of unneeded traffic.

I think there is a way to do this, but it involves a compromise. You need to select a number of dedicated nodes, running servers. These nodes should be interconnected, so that all nodes are aware of each other, and no node going down affects the stability of the network. There could, in theory, be a large number of these nodes. They could dynamically be added, since all you'd need to do is add the address of the new node to the others. The clients would then connect to one node of their choice. Each dedicated node would have to remain in communication with the other dedicated nodes so that each one knows how to route whispers, chat, commands, updates, and so on. All the dedicated nodes would have to maintain copies of each database - or remember which database is stored on which node. You'd have to have 2 protocols, Node-to-Node, and Client-to-Node. If you haven't noticed already, this is (more or less) how Battle.net gateways work - sets of interconnected servers.
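
A minimal sketch of the routing part, assuming each dedicated node tells the others which clients are attached to it. In-memory method calls stand in for the Node-to-Node and Client-to-Node protocols, and all names are hypothetical.

```python
class DedicatedNode:
    """A dedicated server that knows which node every client is attached to."""

    def __init__(self, name):
        self.name = name
        self.peers = []      # other dedicated nodes (Node-to-Node links)
        self.location = {}   # client name -> dedicated node it connected to
        self.inbox = {}      # locally attached client -> messages delivered

    def link(self, other):
        """Interconnect two dedicated nodes."""
        self.peers.append(other)
        other.peers.append(self)

    def client_connect(self, client):
        """Client-to-Node: attach a client here and announce it to all peers."""
        self.inbox[client] = []
        for node in [self] + self.peers:
            node.location[client] = self

    def whisper(self, sender, target, text):
        """Route a whisper to whichever node the target is attached to."""
        home = self.location.get(target)
        if home is None:
            return False
        home.inbox[target].append((sender, text))
        return True


node1, node2 = DedicatedNode("node1"), DedicatedNode("node2")
node1.link(node2)
node1.client_connect("client1")
node2.client_connect("client2")
print(node1.whisper("client1", "client2", "hello"), node2.inbox["client2"])
```

Each client holds a single connection, and only the dedicated nodes need to know about each other.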

The only real difference between this and what kane suggests is that these nodes are dedicated, and are therefore better suited to efficiently handling such things as banned users, mutually exclusive access to databases, and so on.

Obviously, such a system would require vastly, vastly more effort to implement than it's worth.
May 1, 2003, 3:12 AM
Skywing
Suggestion: look at existing distributed networks and how they work. You might start with how IP packets are routed across the Internet in the first place - other things to consider might be the peer to peer filesharing networks (perhaps Overnet).
May 1, 2003, 3:36 AM
Yoni
[quote author=Skywing link=board=17;threadid=1190;start=15#msg8907 date=1051760212]
Suggestion: look at existing distributed networks and how they work. You might start with how IP packets are routed across the Internet in the first place - other things to consider might be the peer to peer filesharing networks (perhaps Overnet).
[/quote]Since Overnet and eDonkey are closed source, you might want to look at the great open source client for the eDonkey network, eMule.

[quote]- Convert from TCP to reliable UDP[/quote]
By the way, "reliable UDP" already exists, and it's called RDP (Reliable Data Protocol). Unfortunately, this protocol is not nearly as popular as TCP and UDP, and you probably have to use raw sockets to use it, which requires admin/root privileges so it's not that practical... :(

However, you may look at the specification of RDP, and take ideas from it (or implement it fully) over UDP (instead of over IP).
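
To illustrate the general idea of layering reliability on top of an unreliable datagram service, here is a stop-and-wait sketch with sequence numbers, acknowledgements and retransmission. It is not RDP and is not taken from its specification. It simulates loss in memory instead of using real UDP sockets, and the function name and loss pattern are made up.

```python
def send_reliably(messages, losses):
    """Deliver messages in order over a lossy channel using stop-and-wait.

    losses is a sequence of booleans consumed once per datagram or ACK sent;
    True means that transmission is lost. Once exhausted, nothing more is lost.
    Returns (messages delivered to the receiver, datagrams transmitted).
    """
    losses = iter(losses)
    delivered = []
    expected = 0          # next sequence number the receiver will accept
    transmissions = 0
    for seq, payload in enumerate(messages):
        while True:
            transmissions += 1
            if next(losses, False):
                continue              # datagram lost: sender times out and resends
            if seq == expected:
                delivered.append(payload)
                expected += 1         # a duplicate of an old seq is not delivered again
            if next(losses, False):
                continue              # ACK lost: sender resends the same datagram
            break                     # ACK received: move on to the next message
    return delivered, transmissions


print(send_reliably(["a", "b"], [True, False, True, False, False]))
```

The sequence number is what lets the receiver drop the duplicate that arrives after a lost ACK, so each message is delivered exactly once even though one of them is transmitted three times.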
May 1, 2003, 2:23 PM
St0rm.iD
Ah, thanks Yoni, didn't know about RDP.

Also, I have done work with Limewire, a Gnutella filesharing client. This protocol is similar except it doesn't use TTLs.
May 1, 2003, 10:22 PM
