Where the spirit does not
work with the hand,
there is no art.

- Leonardo da Vinci



I had been working for a while on a program that captures the contents of other windows and drives them with simulated mouse clicks and key presses. There are plenty of uses for this, including playing games, automating processes, and interfacing with applications that don't have an easily understood messaging scheme (which is pretty useful for home theater control).

What I discovered is that I needed a good way to do simple image recognition, like differentiating between fifty-two cards in a deck or buttons in a window. There was no need to go all out and build and train a neural network for this, since the target images would be static and located in pre-defined positions. All that was necessary was to determine a set of points to examine on each image that could uniquely identify it.
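As a rough illustration of what "a set of points that uniquely identifies each image" means, here is a minimal Python sketch (not the C# the tool itself is written in). It assumes every template has already been reduced to a same-sized grid of on/off values, and it picks points greedily, so the result is not guaranteed to be the smallest possible set. The function name `pick_points` and the toy templates are made up for the example.

```python
from itertools import product

def pick_points(templates):
    """Greedily pick (row, col) points whose on/off values tell all templates apart.

    templates: dict mapping a label to a 2D list of 0/1 values, all the same size
    (an assumption of this sketch). Raises ValueError if two templates are identical.
    """
    labels = list(templates)
    rows = len(templates[labels[0]])
    cols = len(templates[labels[0]][0])
    groups = [labels]   # each group holds labels we cannot tell apart yet
    chosen = []
    while any(len(group) > 1 for group in groups):
        best_point, best_groups = None, groups
        for r, c in product(range(rows), range(cols)):
            # Split every group by the value found at this candidate point.
            split = []
            for group in groups:
                on = [name for name in group if templates[name][r][c]]
                off = [name for name in group if not templates[name][r][c]]
                split.extend(part for part in (on, off) if part)
            if len(split) > len(best_groups):
                best_point, best_groups = (r, c), split
        if best_point is None:
            raise ValueError("some templates are identical at every point")
        chosen.append(best_point)
        groups = best_groups
    return chosen

# Four toy 2x2 "images"; two points are enough to separate them.
toy = {
    "A": [[1, 0], [0, 0]],
    "B": [[0, 1], [0, 0]],
    "C": [[1, 1], [0, 0]],
    "D": [[0, 0], [0, 0]],
}
print(pick_points(toy))  # [(0, 0), (0, 1)]
```

Each chosen point splits the templates that still look alike into an "on" group and an "off" group; the loop stops once every group holds a single template.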

I ended up trying a lot of brain-dead algorithms before I stepped back and worked out how it had to be done. The first step was realizing that I would need at least log2(n) sample points, rounded up, where n is the number of distinct images to tell apart. This is because I was testing each point against a saturation threshold, so each point was essentially a true or false flag, and k such flags can distinguish at most 2^k images. In the case of recognizing cards, I could first determine rank (5 points; 2^5 = 32 > 13 possible ranks, though 2^4 = 16 > 13 means 4 is the theoretical minimum), then suit (which I ended up hard-coding due to its simplicity).
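The counting argument and the true/false flags can be sketched in a few lines of Python. This is only an illustration under assumptions of my own choosing: images are 2D lists of saturation values between 0 and 1, a point counts as "true" when its saturation is at or above the threshold, and the names `min_points`, `signature`, and `build_table` are hypothetical rather than taken from the tool.

```python
def min_points(n):
    """Smallest number of true/false points that can label n distinct images,
    i.e. log2(n) rounded up, computed exactly with integer arithmetic."""
    return (n - 1).bit_length() if n > 1 else 0

def signature(pixels, points, threshold):
    """Pack one flag per sample point into an integer, first point as the top bit.

    pixels: 2D list of saturation values (assumed to lie in 0..1).
    """
    bits = 0
    for r, c in points:
        bits = (bits << 1) | int(pixels[r][c] >= threshold)
    return bits

def build_table(templates, points, threshold):
    """Map each template's signature to its label; fail if two templates collide."""
    table = {}
    for label, pixels in templates.items():
        sig = signature(pixels, points, threshold)
        if sig in table:
            raise ValueError(f"{label} and {table[sig]} share signature {sig}")
        table[sig] = label
    return table

print(min_points(13), min_points(52))  # 4 6
```

Recognition is then a dictionary lookup: compute `signature` for the captured image at the same points and pass it to `table.get(...)`, which returns `None` for an unknown image.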

I wrote this little Windows Forms tool in C# to help with finding points and translating them into image recognition code. I haven't learned how to compile C# so that it runs on machines other than the one where it was built, but the tool probably isn't very useful to you if you don't write code in the first place. If you need it, you probably know how to build it.

irhelper.cs
