Make a New WordPress Loop with query_posts

The “Useful WordPress Function of the Day” award goes to query_posts. This function can be used to:

  • Revise the query that WordPress forms from the URL, so you can change the sorting of posts, exclude certain categories, etc.
    query_posts($query_string . "&order=ASC&category_name=Libraries");

    This takes the current query and sorts it in ascending order, limiting the results to posts in the “Libraries” category.

  • Create custom queries, either for public-facing pages or in administration plugins.
    query_posts(array(
      "category__in" => array(1,3),
      "posts_per_page" => -1,
      "author" => 5
    ));

    This query grabs all the posts by author 5 in categories 1 or 3.

After calling query_posts, you can use your standard WordPress loop, along with all the template tags it makes available, in your template or plugin.

jQuery and Ajax in WordPress Plugins - Public Pages

My previous post teaches you how to use jQuery and Ajax for the administration pages in your WordPress plugins. Using them on your user-facing pages requires a few changes.

We’ll use a similarly contrived example here. Let’s say you use <!--more--> in your longer posts so they don’t fill up too much of your page. Normally, clicking the “Read more…” link (or whatever text you use) takes the user to a separate page with the complete post. In our example, rather than sending the reader to a new page, we’ll make an Ajax request to get the rest of the post and insert it directly into the current page.

jQuery and Ajax in WordPress Plugins - Administration Pages

This is a quick overview of how to use jQuery and its Ajax functions in WordPress. To get the point across, I’ll use a simple and contrived example. We’ll have an admin screen with a list of categories. Clicking on the name of one of the categories will fetch a list of titles of posts in that category and display them as a sub-list of that category.

700,000,000,000 Is a Very Big Number

I’ve no desire to get into the consequences/benefits of the recent financial industry bailout/rescue on this blog. Yes, I have opinions about it, but the opinions of a librarian and web developer with formal training in neither economics nor politics matter little in this discussion.

What I do have to say, though, is that $700,000,000,000 is a lot of money. So much so that I don’t know how much money it is. I think of money in terms of things/services I can exchange it for, as it really has no other use (unless you burn it to heat your home). I know what $1 will get me: a few bananas, or perhaps a used book. With $100, I could buy a couple of weeks of groceries or pay for a visit to the doctor.

Beyond that, money starts getting a little more abstract. I don’t need 4 years’ worth of groceries at once, so I can’t imagine spending $10,000 to buy those groceries. At that level, I can’t think of money in terms of groceries any more; I have to move on to larger, more expensive items, like a car, or a marimba, or a house.

And that’s where my concept of money starts to break down. A million dollars is about the most that I can conceive of; anything more than that is too abstract. The sun is 93,000,000 miles away. How many times will I have to drive to work before I drive 93,000,000 miles? It will take me about 25,000 years, it looks like. A billion is a meaningless number; you might as well say “a lot”. 700 billion is just “a lot more”. What’s the difference?

I certainly don’t know, and I doubt anyone dealing with our economic woes does, either. We’re just throwing big numbers around hoping that they’re big enough that people say, “Oh, that’s a really big number! It must be important!” If everyone believes the number is big enough, faith in the credit markets will be restored, and home prices will increase 40% a year forevermore.

Command Line PDF Editing

As I’ve mentioned before, Acrobat’s JavaScript API lags far behind those of other Adobe applications. Its limitations turned a seemingly simple project I was working on into an exercise in futility.

Overview

I have a collection of a little over 5,000 PDF files, the output of an OCR job. Each file contains one page of a newspaper. Four pages put together would make one issue. My goal, then, is to take four PDF files (e.g., 1896-09-24_001.pdf, 1896-09-24_002.pdf, 1896-09-24_003.pdf, and 1896-09-24_004.pdf) and merge them together into one file (1896-09-24.pdf). Then repeat 1,300 times or so to get the rest of the issues.

Sounds like an ideal job for a small script. Unfortunately, Acrobat only gives JavaScript access to the file system for opening and saving files. It has no way to read a directory for a list of files, which is rather fundamental to the task at hand.

PyPdf Almost Works

At Jay Luker’s suggestion, I tried out PyPdf. It seemed to do everything I needed. And indeed it would have, except it can’t read my PDF files. It does its job just fine with other files, but not with these, which OmniPage created. It seems the files are missing an attribute that PyPdf looks for, so I end up with a KeyError.

pdftk to the Rescue

After much gnashing of teeth on my part, Jason Ronallo suggested I try pdftk, a simple command line tool that can merge and split PDFs, among other capabilities. Merging the issue noted above takes just one line:

pdftk 1896-09-24_001.pdf 1896-09-24_002.pdf 1896-09-24_003.pdf 1896-09-24_004.pdf cat output 1896-09-24.pdf
 

A simple Python script can just call pdftk repeatedly to take care of the whole collection.
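Here is a minimal sketch of what such a script might look like, not the one I actually ran. It assumes every page file follows the YYYY-MM-DD_NNN.pdf naming shown above, that pdftk is on the PATH, and that the merged files should land in the same directory. The function names (group_pages, build_command, merge_all) are made up for illustration.

```python
import re
import subprocess
from collections import defaultdict
from pathlib import Path

# Assumption: page files are named like 1896-09-24_001.pdf
PAGE_PATTERN = re.compile(r"^(\d{4}-\d{2}-\d{2})_(\d{3})\.pdf$")


def group_pages(filenames):
    """Map each issue date to its page files, sorted by page number."""
    issues = defaultdict(list)
    for name in filenames:
        match = PAGE_PATTERN.match(name)
        if match:  # skip anything that is not a page file
            issues[match.group(1)].append(name)
    # Zero-padded page numbers sort correctly as plain strings
    return {date: sorted(pages) for date, pages in issues.items()}


def build_command(date, pages):
    """Build the pdftk argument list that merges one issue."""
    return ["pdftk", *pages, "cat", "output", f"{date}.pdf"]


def merge_all(directory="."):
    """Run pdftk once per issue found in the given directory."""
    names = [p.name for p in Path(directory).iterdir() if p.is_file()]
    for date, pages in sorted(group_pages(names).items()):
        # check=True stops the run if pdftk reports an error
        subprocess.run(build_command(date, pages), cwd=directory, check=True)


if __name__ == "__main__":
    merge_all()
```

Keeping the grouping and the command building apart from the pdftk call means the file matching can be checked on a list of names before anything touches the real collection.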

Many thanks, pdftk and #code4lib.