
This is not rate-based. You don't attempt to figure out if a given IP is flooding you, just if the IP is on a whitelist.

A tree IS an array.

The time it takes to open a text file is irrelevant; all that does is give you a file descriptor. The slow part comes when you're looking through a few-thousand-line text file for something (if that's how you choose to do it).

Quote:(09-10-2013, 04:02 AM)w00t Wrote:
This is not rate-based. You don't attempt to figure out if a given IP is flooding you, just if the IP is on a whitelist.

Given Wikipedia's description, in this scenario the server works continuously and granularly (per GET request): it needs to detect a traffic anomaly, and it attempts to let legitimate traffic flow through.

Quote:(09-10-2013, 04:02 AM)w00t Wrote:
A tree IS an array.

No. A tree consists of node objects, which each hold a reference to an object, and the next node(s) in the tree, if there are any. An array has a fixed size, and must reserve memory for every object it can hold, even if the object hasn't been created yet.

The time efficiencies are also different (O(log n) lookup in a balanced tree vs. O(1) indexing into an array).

Quote:(09-10-2013, 04:02 AM)w00t Wrote:
The time it takes to open a text file is irrelevant; all that does is give you a file descriptor. The slow part comes when you're looking through a few-thousand-line text file for something (if that's how you choose to do it).

I'm not sure how your web server works, but mine looks through a map of pages to file names, opens the file, sends it as bytes, and closes the file again. I'm confused about why I would be looking through a few-thousand-line text file.

The key is that you only enable the screening when an anomaly is detected, but the screening itself is not based on whether a particular request seems to be triggering the anomaly.

I've not encountered any relatively low-level language that has a tree type built in (I know Matlab has one, and so does R). It tends to just be a re-declared array. You can maintain dynamically sized arrays by using pointers, but that's beside my point. Even if you are only using the number of bytes required by the size of the whitelist, that isn't scalable. In the worst case, you need around 7 or 8 sub-trees per IP you want whitelisted. Assuming you use the smallest data type (in C), that's still 7 or 8 bytes per IP. That's not the best thing to try to store in RAM. If you store them in a file, you can open the file once, declare it as a superglobal, and read it when needed, searching for the IP to test and returning true if it's present.

I'd overlooked the part where the file actually gets read, which is why I only included a rudimentary description of what it is to open a file.
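A rough Python sketch of the file-based lookup described above: open the file once, keep the handle around, and scan it line by line for the address. The class name `FileWhitelist` and the one-address-per-line file layout are assumptions made for illustration, not something specified in the thread.

```python
class FileWhitelist:
    """Keeps one open handle to a whitelist file (assumed: one IP per line)."""

    def __init__(self, path):
        # Opened once and reused, like the "superglobal" handle in the post.
        self._handle = open(path, "r", encoding="utf-8")

    def contains(self, ip):
        # Rewind, then do a linear scan; cost grows with the file length.
        self._handle.seek(0)
        for line in self._handle:
            if line.strip() == ip:
                return True
        return False

    def close(self):
        self._handle.close()
```

Each call to `contains` is a full linear scan in the worst case, which is the cost the two sides of this thread are arguing about.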

Quote:(09-10-2013, 07:11 AM)w00t Wrote:
The key is that you only enable the screening when an anomaly is detected, but the screening itself is not based on whether a particular request seems to be triggering the anomaly.

A separate process can be run to monitor the system's resources; screening can be triggered by resources exceeding some threshold value.
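As a minimal sketch of that idea in Python (the class name `ScreeningSwitch`, the `sample_load` callable, and the threshold value are all hypothetical), a monitor could poll a load figure and flip a flag whenever it crosses the threshold:

```python
class ScreeningSwitch:
    """Turns screening on while a sampled load value exceeds a threshold."""

    def __init__(self, sample_load, threshold):
        self.sample_load = sample_load  # callable returning the current load
        self.threshold = threshold
        self.screening = False

    def poll(self):
        # Called periodically by the monitoring process or thread.
        self.screening = self.sample_load() > self.threshold
        return self.screening
```

On a Unix system `sample_load` could be something like `lambda: os.getloadavg()[0]`; which resource to watch and what threshold to use are left open here.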

Quote:(09-10-2013, 07:11 AM)w00t Wrote:
I've not encountered any relatively low-level language that has a tree type built in (I know Matlab has one, and so does R). It tends to just be a re-declared array. You can maintain dynamically sized arrays by using pointers, but that's beside my point. Even if you are only using the number of bytes required by the size of the whitelist, that isn't scalable. In the worst case, you need around 7 or 8 sub-trees per IP you want whitelisted. Assuming you use the smallest data type (in C), that's still 7 or 8 bytes per IP. That's not the best thing to try to store in RAM. If you store them in a file, you can open the file once, declare it as a superglobal, and read it when needed, searching for the IP to test and returning true if it's present.

ArrayLists (dynamically resizable arrays) are also not arrays. As for built-in types, different problems often require different trees, and they are generally re-coded per the requirements of the problem. You need exactly 12 subtrees per IP, but this isn't very much. Let's do some math. Objects in Python (because it's the easiest for me) cost 36 bytes per object, and pointers cost 32 or 64 bytes, depending on the computer; let's assume 64. Finally, actually storing the "trust level" should cost 12 bytes (an int). Each new IP costs a MAXIMUM of 12 objects, 11 pointers, and 1 int, for a grand total of 1148 bytes. Let's pretend that the maximum space is taken before the algorithm begins to optimize (worst-case scenario). We still come out at about 0.3 MB (292740 bytes) for the entire range of 0-255.x.x.x.

Even if you have a low-end computer with 512 MB of RAM, it's still feasible to store.

Storing them in a file will be unhelpful (especially during a DoS attack) because it will take far too long to open, search through, and close the file again.
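The arithmetic in that estimate can be reproduced with a few lines of Python. The per-item sizes are the post's own assumptions, not measurements; in particular, a pointer on a typical 64-bit machine is 8 bytes rather than 64, so the pointer term is likely an overestimate. The constant and function names are made up for this sketch.

```python
# Sizes as assumed in the post (bytes); not measured values.
OBJECT_BYTES = 36
POINTER_BYTES = 64
INT_BYTES = 12


def worst_case_bytes_per_ip(objects=12, pointers=11, ints=1):
    """Worst-case cost of one fully un-shared IP path, per the post's model."""
    return objects * OBJECT_BYTES + pointers * POINTER_BYTES + ints * INT_BYTES


per_ip = worst_case_bytes_per_ip()
total = per_ip * 255  # the post's multiplier for the 0-255.x.x.x range
print(per_ip, total)  # 1148 292740
```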

I'm still trying to write some sample code, as a proof of concept, but I've run into bug after bug, and I have a class in about half an hour.

The idea of a rate-based system is that you screen and filter out high-rate traffic. That screening and filtering mechanism is the very thing you're trying to innovate.

1148 B = ~0.001MB.

Let's say we have 750 users on the whitelist (about what 0day.red has, and keep in mind this is a relatively small forum), and we'll be nice and say only 500 use the full amount of bytes. That's half a MB. Still insignificant on a server, but it's important to recognize its scalability issues.

Take ha-ck-forums as an example. Their most active users in one day was ~2000. If we assume the same percentage of people use the full memory, we get ~1332 using the full memory requirement, totaling 1.5 MB, and that assumes we only care about those 2000, 0.5% of the total registered users.

A blacklist would be easier, and would be a true rate-based system. When a certain type of IP (maybe an IP range) is detected as an attacker, create a regular expression and add it to an array to test IPs against.
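A minimal Python sketch of such a regex blacklist, assuming attackers are identified by a dotted prefix such as `203.0.113.` (a documentation-only range used here as a placeholder). The class and method names are hypothetical, and how an address gets flagged as an attacker is outside the sketch.

```python
import re


class RegexBlacklist:
    """Holds compiled patterns; an IP is blocked if any pattern matches it."""

    def __init__(self):
        self.patterns = []

    def block_prefix(self, prefix):
        # re.escape keeps the dots literal; re.match anchors at the start.
        self.patterns.append(re.compile(re.escape(prefix)))

    def is_blocked(self, ip):
        return any(pattern.match(ip) for pattern in self.patterns)


blacklist = RegexBlacklist()
blacklist.block_prefix("203.0.113.")
print(blacklist.is_blocked("203.0.113.7"))   # True
print(blacklist.is_blocked("198.51.100.1"))  # False
```

Every request is tested against every stored pattern, so the check slows down as the list of attacker ranges grows.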

Quote:(09-11-2013, 01:00 AM)w00t Wrote:
The idea of a rate-based system is that you screen and filter out high-rate traffic. That screening and filtering mechanism is the very thing you're trying to innovate.

1148 B = ~0.001MB.

Let's say we have 750 users on the whitelist (about what 0day.red has, and keep in mind this is a relatively small forum), and we'll be nice and say only 500 use the full amount of bytes. That's half a MB. Still insignificant on a server, but it's important to recognize its scalability issues.

Take ha-ck-forums as an example. Their most active users in one day was ~2000. If we assume the same percentage of people use the full memory, we get ~1332 using the full memory requirement, totaling 1.5 MB, and that assumes we only care about those 2000, 0.5% of the total registered users.

A blacklist would be easier, and would be a true rate-based system. When a certain type of IP (maybe an IP range) is detected as an attacker, create a regular expression and add it to an array to test IPs against.

This seems the most logical so far, but what about a botnet DDoSing from all around the world? How would the system handle those IP ranges without ill effects on legitimate users?

Quote:(09-11-2013, 01:16 AM)Sinisterkid Wrote:
This seems the most logical so far, but what about a botnet DDoSing from all around the world? How would the system handle those IP ranges without ill effects on legitimate users?

Only filter out a /26 subnet, and remove it when done. This prevents abuse of dynamic IPs (for the most part) with minimal impact on legitimate users, as even those caught in the subnet are only disallowed during the attack.
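A sketch of the /26 idea using Python's standard `ipaddress` module. The class name `SubnetBlocklist` and its methods are hypothetical, and deciding when the attack is over (and so when to call `clear`) is left out.

```python
import ipaddress


class SubnetBlocklist:
    """Temporarily blocks the subnet surrounding each offending address."""

    def __init__(self):
        self.blocked = set()

    def block_subnet_of(self, ip, prefix_len=26):
        # strict=False masks off the host bits, e.g. .70 -> the .64/26 network.
        network = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        self.blocked.add(network)
        return network

    def is_blocked(self, ip):
        address = ipaddress.ip_address(ip)
        return any(address in network for network in self.blocked)

    def clear(self):
        # Lift every block once the attack is considered over.
        self.blocked.clear()
```

A /26 covers 64 addresses, so blocking the subnet of `203.0.113.70` affects `203.0.113.64` through `203.0.113.127` and nothing else.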

Quote:(09-11-2013, 02:12 AM)w00t Wrote:
Only filter out a /26 subnet, and remove it when done. This prevents abuse of dynamic IPs (for the most part) with minimal impact on legitimate users, as even those caught in the subnet are only disallowed during the attack.

Makes sense. I like that better than what Cloudflare does; then your whole site is down and it's pointless anyway.

Quote:(09-11-2013, 01:00 AM)w00t Wrote:
The idea of a rate-based system is that you screen and filter out high-rate traffic. That screening and filtering mechanism is the very thing you're trying to innovate.

1148 B = ~0.001MB.

Let's say we have 750 users on the whitelist (about what 0day.red has, and keep in mind this is a relatively small forum), and we'll be nice and say only 500 use the full amount of bytes. That's half a MB. Still insignificant on a server, but it's important to recognize its scalability issues.

Take ha-ck-forums as an example. Their most active users in one day was ~2000. If we assume the same percentage of people use the full memory, we get ~1332 using the full memory requirement, totaling 1.5 MB, and that assumes we only care about those 2000, 0.5% of the total registered users.

A blacklist would be easier, and would be a true rate-based system. When a certain type of IP (maybe an IP range) is detected as an attacker, create a regular expression and add it to an array to test IPs against.

Alright, I did a thing, and here are the results:

Using the code:

[Hidden content: registration required to view]

Searching for the IP 100,000 times takes 1.0253695160000689 seconds, slightly more than opening, reading, and closing a small text file, effectively doubling the time it takes to serve .html files.

Additionally, non-added addresses take less time (mean of 0.512684768, with a standard deviation of 0.002628457, per 100,000 lookups) to be recognized as not in the list.
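The code behind those timings is hidden, so the following is not the poster's implementation. It is only a guess at the general shape of such a test: a digit trie built from nested dicts (12 levels, one per zero-padded digit, matching the "12 subtrees per IP" figure mentioned earlier) timed with `timeit`. All function names are hypothetical.

```python
import timeit


def ip_digits(ip):
    # "10.0.0.1" -> "010000000001": twelve digits, one trie level each.
    return "".join(f"{int(octet):03d}" for octet in ip.split("."))


def add_ip(trie, ip, trust=1):
    node = trie
    for digit in ip_digits(ip):
        node = node.setdefault(digit, {})
    node["trust"] = trust  # the trust level lives on the last node


def lookup(trie, ip):
    node = trie
    for digit in ip_digits(ip):
        node = node.get(digit)
        if node is None:
            return None  # a miss can stop at the first absent digit
    return node.get("trust")


trie = {}
add_ip(trie, "113.113.113.113", trust=5)
hit = timeit.timeit(lambda: lookup(trie, "113.113.113.113"), number=100_000)
miss = timeit.timeit(lambda: lookup(trie, "10.0.0.1"), number=100_000)
print(f"hit: {hit:.3f}s  miss: {miss:.3f}s per 100,000 lookups")
```

In a structure like this a miss can return before walking all 12 levels, which would be consistent with non-added addresses taking less time, though the actual numbers depend on the machine and on the real code.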

To find the memory taken, I used pympler and replaced the code at the end with:

[Hidden content: registration required to view]

Memory taken for a normal distribution of 400,000 (H-F) unique IP addresses around 113.113.113.113 with a standard deviation of 50 yields the following:

[Hidden content: registration required to view]

I'm a little on edge to even try to look at it right now; I'll figure it out after a little Dota 2. Just looking at 2.36 GB, I assume it means my method is worse than other methods out there. Thanks to everyone, and to w00t for his persistence.

This is a horrible way of doing it. When a user connects to a web page, he connects to the server first through an IP and a port (that's all a URL is: a masked IP address on port 80, 443 or 8080). A trust system would work if you're trying to stop forum post bombs, but not DoS or DDoS attacks, and even then it would be inefficient.
