Colored letter that imply it is secret hacking code

fopen() for Databases?

This is a VERY old post of mine from 2009 that I’m sharing for sport.

This entry assumes that you have some prior knowledge of databases, PHP, or other programming languages. However, for those that don’t, I’ll try to sum it up in a way for visitors to understand.

The statement “fopen”, or as PHP would like me to call it “fopen()”, is PHP’s File Open command. With this file you can read, write, and append to files — a wonderful feature. In my programming past, I never had a single issue with my applications using similar functions (OPEN in BASIC, for example). I used them for creating a file-name cache system, searching records, storing information, etc.

Then came the true test — An online message board that allowed multiple people to access my databases at one time. Because of my single-threaded past, I never suspected the issues that FastCGI (which is a multi-threaded application that runs several instances at one time) would present.

As two users were posting to a topic at one time (or even multiple topics at one time), the server became haywire trying to sort out the multiple file reads of the same file, which opened, inserted data, and sorted the results; finally saving it.

For those who have developed single-user at-a-time applications, you may not know what a “Race Hazard” is.

For an elementary explanation: Imagine two children drawing on the same one-inch sheet of paper, each with large markers. Eventually they will bump hands, distorting what their original intention was.

Well, because PHP (particularly PHP using the FastCGI interface) will open several processes, each capable of operating independently (and unaware of each other), it is similar to having two children writing in the same box. An easy way to fix this for Linix servers would be setting a locking flag, but for Windows users, it isn’t so easy.

You now have an index, or collection of files (please tell me they’re not part of a life-support system), that has all kinds of arbitrary data strewed about inside of it. The script that reads this file cannot make any logical sense of what is going on, because your system of using line-feeds for field delineation was a failed approach, and the dates are appearing in the post field, the post is the username, and the date isn’t showing because it isn’t a numeric variable, so the DATE() function can’t make any sense of it.

So how can fopen be used for databasing? Well, first things first, I find that learning MySQL is highly beneficial, as the cache improves speed, the row-locking eliminates the race hazard, and one-file backups for a site? Well…I don’t think I need to say more.

Can it be done? Yes, it sure can. But, as I’ve painfully learned: Why re-invent the wheel?

2023 Update: This is still true and accurate, though what I didn’t get into was ACID compliance. In particular, I neglected that you want to ensure atomicity of the identifiers being created and that the transactions are processed FIFO (first in, first out). This stops multiple processes from stomping on each other.


Posted

in

,

by

Tags:

Comments

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.