READIN: Programming (as of February 13, 2008)

(This is a page from my archives)


Programming

Posts about Programming

READIN started out as a place for me to keep track of what I am reading, and to learn (slowly, slowly) how to design a web site.

There has been some mission drift here and there, but in general that's still what it is. Some of the main things I write about here are reading books, listening to (and playing) music, and watching the movies. Also I write about the work I do with my hands and with my head; and of course about bringing up Sylvia.

The site is a bit of a work in progress. New features will come on-line now and then; and you will occasionally get error messages in place of the blog, for the foreseeable future. Cut me some slack, I'm just doing it for fun! And if you see an error message you think I should know about, please drop me a line. READIN source code is PHP and CSS, and available on request, in case you want to see how it works.

See my reading list for what I'm interested in this year.

READIN has been visited approximately 236,737 times since October, 2007.

Wednesday, February 13th, 2008

🦋 `vim` rules!

Here's a neat trick: let's say you forgot to close a code block somewhere in a large C source file, and you can't figure out where, and the compiler is not helping you. Try putting a } character at the very end of the file, placing the cursor on it, and pressing the % key -- vim will jump to the most recent { which was not matched by a }. (Note: this will not work if you have unmatched {'s inside conditional compilation blocks, which is generally a bad habit anyway.)

posted afternoon of February 13th, 2008

🦋 1024

A good thing to keep in mind when you are trying to write a TCP server that will support thousands of simultaneous connections: the default ulimit for maximum number of open files on a Linux system appears to be 1024. A further thing to keep in mind once you figure out how to adjust that upwards: there is a reason the default maximum is 1024!
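A program can also inspect and raise this limit for itself, without going through the shell. Here is a minimal sketch using getrlimit() and setrlimit() with RLIMIT_NOFILE; the two function names are my own, and an unprivileged process can only raise the soft limit as far as the hard limit.

```cpp
#include <sys/resource.h>

// Hypothetical helper: reads the soft and hard limits on open file
// descriptors for this process. Returns false if they cannot be read.
bool getOpenFileLimits(rlim_t& soft, rlim_t& hard)
{
    rlimit lim;
    if (getrlimit(RLIMIT_NOFILE, &lim) != 0)
        return false;
    soft = lim.rlim_cur;
    hard = lim.rlim_max;
    return true;
}

// Hypothetical helper: tries to raise the soft limit to `wanted`,
// capped at the hard limit. Returns true if the soft limit ends up
// at least as high as the capped value.
bool raiseOpenFileLimit(rlim_t wanted)
{
    rlimit lim;
    if (getrlimit(RLIMIT_NOFILE, &lim) != 0)
        return false;
    if (wanted > lim.rlim_max)
        wanted = lim.rlim_max;      // cannot exceed the hard limit
    if (wanted <= lim.rlim_cur)
        return true;                // already high enough, nothing to do
    lim.rlim_cur = wanted;
    return setrlimit(RLIMIT_NOFILE, &lim) == 0;
}
```

A server would call raiseOpenFileLimit() once at startup, before it starts accepting connections.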

See, if you use select() to multiplex your I/O (like I do), you will be passing structures of type fd_set around. These structures can only deal with file descriptors less than or equal to 1023. Try and set bit 1024, and you will break your program. But fear not! There is a solution; that solution is to use poll() instead of select(); apparently poll() is the new standard. First I ever heard of it! Switching from one to the other seems like it's not too hard, though I've just now started.

posted afternoon of February 13th, 2008

Wednesday, January 23rd, 2008

🦋 watch

Cool! I found and fixed a bug today using gdb's watchpoint feature, which I have never tried before. (Not cool: the bug was a careless typo that I should never have introduced.)

posted evening of January 23rd, 2008

Saturday, November 10th, 2007

🦋 Comment Spam II

OK: The comment spam filter I have in place right now is working (so far); but it would be pretty easy to circumvent if a spammer were determined enough. But I have in mind a pretty simple way to expand it and make it secure, and way better than the captcha images that everybody hates. (Drawback is, it relies on JavaScript, which not every browser supports. This could be gotten around a couple of different ways.) I am going to try and implement it over the next few weeks and then I will write it up and try to get other people using it -- it's way better than captchas. (I won't write it up until it's in place because the writeup would include information on how to get around the current, insecure filter I have in place.)

Update: Oh wait, no it actually wouldn't be much more secure than the current scheme. A little harder to get around I guess.

posted afternoon of November 10th, 2007

Friday, October 12th, 2007

🦋 Passed the first test

So in my log I see a bunch of requests today for

GET blog/?k=<keyword> \'\'
and(char(94)+user+char(94))>0 and 
\'\'\'\'=\'\'

where <keyword> is one of the keywords that links exist to on the site; and also I see that my script translated those requests to

<keyword> \\\'\\\' 
and(char(94)+user+char(94))>0 and 
\\\'\\\'\\\'\\\'=\\\'\\\'

before passing them to the database. So the queries just returned empty sets instead of wreaking whatever havoc they might have wruck unescaped. Yay PHP! Yay careful programming!

(Note: but while editing this post I realized there is a different kind of escaping that you have to do when you are writing to forms -- the < and > signs were translating to markup in my inputs. Funny I never ran into that problem on the old site, you wouldn't think it would be a PHP-vs.-ASP distinction.)
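The second kind of escaping is simple enough to sketch. The site itself is PHP, where htmlspecialchars() does this job; the version below is just an illustration of the idea in C++, with a function name I made up, and is not the code the site runs.

```cpp
#include <string>

// Hypothetical helper: replaces the characters that HTML treats as markup
// with their entity equivalents, so text can be placed safely inside a page
// or a double-quoted form-field value.
std::string escapeHtml(const std::string& text)
{
    std::string out;
    out.reserve(text.size());
    for (char c : text) {
        switch (c) {
        case '&':  out += "&amp;";  break;  // escaped so later entities stay literal
        case '<':  out += "&lt;";   break;
        case '>':  out += "&gt;";   break;
        case '"':  out += "&quot;"; break;  // matters inside attribute values
        default:   out += c;        break;
        }
    }
    return out;
}
```

With this, the text <keyword> goes into the form as &lt;keyword&gt; and shows up on screen as typed instead of being swallowed as a tag.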

Update: So what do I have to do to ban these guys from my site? I tried putting the following in my httpd.conf:

<Directory (path to root of my site)>
    order allow,deny
    deny from (IP)
    deny from (IP)
    allow from all
</Directory>

and restarting the service, but that does not seem to have done it.

Another Update: I think I got it: the Directory directive in apache2/sites-available/default is overriding the directive in httpd.conf because httpd.conf is included first. I think I just need to take the default directive out.

posted evening of October 12th, 2007

Monday, October 8th, 2007

🦋 Categories

Like I said below, I don't have much experience with database design. I don't really have any clue how to write a design document. But I want to describe the design I've come up with and see if I can make it sound as good as it appears to me to be.

The thinking behind this is as follows: I have a lot of text records ("posts") which I want to classify by subject. I've done this, just like every other blog around, by using keywords -- if I tag a post with "food" say, or "singing", then it will show up when somebody looks at the site filtering for that subject. This is implemented with a simple search through the list of keywords on each post; not particularly fast but that's not a major problem in the context of my low-traffic site.

But when I was putting the new software together, I had the idea that it would be great if, when somebody looked at the blog filtering for "food", they would see a little sidebar explaining what I write about when I write about food, and maybe some links to food sites I like etc. And more to the point, when somebody filters for "book:namered" (which is how I've been tagging my reading posts, "book:" and then a short identifier for the title), they would see up top that the posts were about My Name is Red by Orhan Pamuk, links to some outside reviews, links to Amazon and Abebooks, maybe a list of other of Pamuk's books that I have written about. So that is the problem I am trying to solve; and I think my solution is a pretty good one.

First, simple keywords, like "food" and "singing". This is pretty easy; I have a table keyword with columns tag and description -- the description is what will be displayed in the sidebar when somebody filters by the tag. And I have a table (which I decided to name categories, for reasons that will soon become apparent) with two columns, postid and keyword -- I can join this table with posts when I want to do a filtering operation.

Now what about the complex keywords like "book:namered", which include a class and an instance? Well check it out: every time I add a keyword which has a new class, I can just add a column to the categories table with the class name as the column name. And add a table with that name, which looks the same as the keyword table. And think of simple keywords as a special case of complex keywords, as if they had "keyword:" in front of them. So if somebody requests a filter for "book:namered", I can query from "posts JOIN categories ON posts.id = categories.postid JOIN book ON categories.book = book.tag" where book.tag = "namered". This will work for movies, projects, whatever. But the really cool thing is, I can add whatever columns I want to the book table and write a custom script to display the data associated with the tag "namered" in my sidebar.

Consider these three requests:

SELECT posts.* FROM posts JOIN categories ON posts.id = categories.postid WHERE categories.book = 'namered';
(This query would be represented by the keyword "book:namered".)
SELECT DISTINCT posts.* FROM posts JOIN categories ON posts.id = categories.postid JOIN book ON categories.book = book.tag;
(This query would be represented by the keyword "book:".)
SELECT posts.* FROM posts JOIN categories ON posts.id = categories.postid JOIN book ON categories.book = book.tag WHERE book.author = 'pamuk';
(This query would be represented by the keyword "book:author:pamuk".)

The first query will bring back all posts about My Name is Red. The second query will bring back all posts about reading any book. The third query will bring back all posts about reading any book by Orhan Pamuk. And all this is pretty easy to automate! It's all nearly in place!
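To make "easy to automate" a little more concrete, here is a sketch of the keyword-to-query translation. The real site does this in PHP; this C++ version and its function names are only illustrative. It assumes the table and column names used above, and that class names, column names and tags contain nothing but letters, digits and underscores. Anything else is rejected rather than pasted into the SQL.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Splits "book:author:pamuk" into {"book", "author", "pamuk"}.
// A trailing colon yields a trailing empty part: "book:" -> {"book", ""}.
std::vector<std::string> splitOnColon(const std::string& s)
{
    std::vector<std::string> parts;
    std::size_t start = 0;
    for (;;) {
        std::size_t pos = s.find(':', start);
        if (pos == std::string::npos) {
            parts.push_back(s.substr(start));
            return parts;
        }
        parts.push_back(s.substr(start, pos - start));
        start = pos + 1;
    }
}

// True for non-empty strings made only of letters, digits and underscores.
bool isIdentifier(const std::string& s)
{
    if (s.empty())
        return false;
    for (char c : s)
        if (!std::isalnum(static_cast<unsigned char>(c)) && c != '_')
            return false;
    return true;
}

// Hypothetical translator: returns the SQL for a keyword, or "" if the
// keyword is malformed. A simple keyword like "food" is treated as "keyword:food".
std::string buildQuery(const std::string& keyword)
{
    std::vector<std::string> p = splitOnColon(keyword);
    if (p.size() == 1)
        p.insert(p.begin(), "keyword");
    if (p.size() > 3 || !isIdentifier(p[0]))
        return "";

    const std::string& cls = p[0];
    const std::string from =
        "posts.* FROM posts JOIN categories ON posts.id = categories.postid";
    const std::string joinClass =
        " JOIN " + cls + " ON categories." + cls + " = " + cls + ".tag";

    if (p.size() == 2) {
        if (p[1].empty())                       // "book:" means any book
            return "SELECT DISTINCT " + from + joinClass + ";";
        if (!isIdentifier(p[1]))
            return "";
        return "SELECT " + from + " WHERE categories." + cls + " = '" + p[1] + "';";
    }
    if (!isIdentifier(p[1]) || !isIdentifier(p[2]))
        return "";                              // "book:author:pamuk" form
    return "SELECT " + from + joinClass +
           " WHERE " + cls + "." + p[1] + " = '" + p[2] + "';";
}
```

Called with "book:namered", "book:" and "book:author:pamuk", this produces the three queries listed above. The strict identifier check is what keeps a request like the one in my October 12th post from turning into SQL; with a real database library it would be better still to bind the tag value as a query parameter.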

The next step, which will take a bit of effort to keep elegant but is totally within reach, is to create an administrative page for writing scripts to render an informative sidebar based on the column data contained in, say, the "namered" record in the book table.

posted evening of October 8th, 2007

🦋 Programming head

Is a head I like to be in. For like a week now I've been thinking non-stop about the design of the site, how I can put features in and have the code look elegant and run quickly, what features belong in a coherent model. It gives me a real feeling of focus, like I have when I'm reading a book that I'm really absorbed in. It can be annoying not to be able to focus on other stuff, but oh well, it's pretty much worth it.

posted evening of October 8th, 2007

🦋 On reinventing the wheel

When I was new to programming, in 1994 or '5 -- when OLE was a pretty freshly minted technology -- one of the projects I was working on was a way to abstract the functionality of some of my company's libraries into a common interface so that a program could load any of the libraries dynamically at runtime, based on a string key. I came up with the stunning realization that the interface could be expressed as a pure virtual C++ base class. All the libraries had to do was to export a function called "Create_x" which would instantiate an object whose class inherited interface x.

This seemed to me like an awesome bit of innovation. By funny coincidence, another project I was working on around the same time was converting some of the company's VBX controls to OCX. (I don't think the term "ActiveX" had even been coined yet, but regardless we were not using it.) I wasn't reading the documentation of OLE very closely, relying on Microsoft's compiler to do most of the work for me; so it wasn't until a month or so later that I realized I had just reinvented a subset of OLE, and that I could have used OLE's framework to give my design a little more robustness. But whatever, the feeling that I was doing something new and inventive was payoff enough.

So why this now? Well, I've been doing some pretty intensive design work in coming up with the database that supports this blog ("READIN 2.0", I am calling it in my head), and I have come up with a pretty cool idea. It seems innovative to me because it is something I've never heard of anyone doing; but I am not at all schooled in database design. I will write it up later on or tomorrow, and hopefully somebody will write back to me and let me know who invented it and where I can find out more.

posted evening of October 8th, 2007

Sunday, March 11th, 2007

🦋 Interval

Here is a bash script to determine the interval between two date/times. Parameters are two dates, specified using any format the date utility can recognize; if the second parameter is omitted, "now" is assumed. Output is the number of seconds between the two, followed by "d h:m:s" format.

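 #!/bin/bash
 # Print the interval between two date/times: total seconds, then "d h:mm:ss".
 
 if [ $# -eq 0 ]
 then
         echo "Usage: $(basename "$0") <date1> [<date2>, default \"now\"]" >&2
         exit 1
 fi
 
 # Convert both arguments to seconds since the epoch (relies on GNU date's -d).
 start=$(date +%s -d "$1") || exit 1
 if [ $# -eq 1 ]
 then
         fin=$(date +%s)
 else
         fin=$(date +%s -d "$2") || exit 1
 fi
 
 # Use the absolute difference so the order of the arguments does not matter.
 res=$((fin - start))
 if [ $res -lt 0 ]
 then
         res=$((-res))
 fi
 
 echo $res sec
 d=$((res / 86400))      # whole days
 t=$((res % 86400))      # seconds left over within the last day
 h=$((t / 3600))
 ms=$((t % 3600))
 m=$((ms / 60))
 s=$((ms % 60))
 if [ $d -gt 0 ]
 then
         echo -n $d day
         if [ $d -gt 1 ]
         then
                 echo -n s
         fi
         echo -n ' '
 fi
 if [ $t -gt 0 ]
 then
         echo -n $h\:
         if [ $m -lt 10 ]
         then
                 echo -n 0
         fi
         echo -n $m
         # Seconds are only shown when they are nonzero.
         if [ $s -gt 0 ]
         then
                 echo -n \:
                 if [ $s -lt 10 ]
                 then
                         echo -n 0
                 fi
                 echo -n $s
         fi
 fi
 echo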

posted evening of March 11th, 2007

Friday, January 13th, 2006

🦋 Sum of 2 different squares, 3 different ways

Over at Unfogged, Frederick suggests that 325 is the smallest number which can be expressed as a sum of two perfect squares three different ways. I just wrote a program to check this which confirms Frederick's suspicion; here it is if you want to check my logic.

 #include <stdio.h>
 
 int perfect[] = {
     1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 
     11 * 11, 12 * 12, 13 * 13,
     14 * 14, 15 * 15, 16 * 16, 17 * 17, 
     18 * 18, 19 * 19, 20 * 20
     };
 
 bool IsSumOfSq(int s, int &a, int &b, int x1, int x2)
 {
     for (int i = a + 1; i < 20; ++i)
     {
         if (s < perfect[i])
             return false;
         int diff = s - perfect[i];
         for (int j = 0; j < 20; ++j)
             if (j == x1 || j == x2)
                 continue;
             else if (perfect[j] == diff)
             {
                 a = i;
                 b = j;
                 return true;
             }
     }
     return false;   // ran off the end of the table without finding a pair
 }
 
 int main()
 {
     int i;
     for (i = 0; i < 400; ++i)
     {
         int a = -1, b;
         if (IsSumOfSq(i, a, b, -1, -1))
         {
             int c = a, d;
             if (IsSumOfSq(i, c, d, a, -1))
             {
                 int e = c, f;
                 if (IsSumOfSq(i, e, f, a, c))
                 {
                     printf("%d = %d^2 + %d^2\n"
                           "    = %d^2 + %d^2\n"
                           "    = %d^2 + %d^2", 
                         i, a + 1, b + 1, c + 1, 
                         d + 1, e + 1, f + 1);
                     break;
                 }
             }
         }
     }
     return 0;
 }

Output:

325 = 1^2 + 18^2
    = 6^2 + 17^2
    = 10^2 + 15^2

posted evening of January 13th, 2006


`READIN`

Jeremy's journal