Filter Your Feeds With Yahoo Pipes

If you look at my blog roll, you may notice a few planets: Planet Cataloging, planet code4lib, and Planet RDF. These topical aggregators are quite handy, saving me much of the trouble of seeking out new and relevant blogs. I just subscribe to the feed for the aggregator and let the maintainer do all the hard work.

Aggregators do have their drawbacks, though. A fair number of blogs that I’ve no interest in reading get lumped in with those of value to me. Since some blogs are included in more than one feed, I also end up getting duplicates in my feed reader (Google Reader, if you’re curious). All of the clutter can be a nuisance and a time-waster. I need a way to filter out the chaff.

Ed Summers kindly suggested that I try out Yahoo Pipes (thanks, Ed). Pipes is a visual editor that lets you take input from various feeds (RSS, Atom, RDF, or iCal) and other data sources (XML, JSON, iCal, or KML), run the information through a series of filters, and output the results as RSS 2.0, RSS 1.0, JSON, or Atom.

I’ve now built my first pipe. Smaller Planets grabs the feeds from the aforementioned planets, removes any items with del.icio.us in the title (I prefer not to see what everyone bookmarked today) and removes any duplicates (based on the URL of the post). After just a couple of minutes getting Pipes figured out, I now have my custom feed to subscribe to.
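For anyone who would rather see that logic as code than as boxes and wires, here is a rough Python sketch of what the pipe does: merge the feeds, drop items by title, and drop duplicates by URL. It is not how Pipes works internally. The function name, the dictionary layout of the items, the case-insensitive title match, and the example.org URLs are all assumptions made up for illustration.

```python
def smaller_planets(feeds, blocked_title_text="del.icio.us"):
    """Merge feeds, skip items whose title contains the blocked text,
    and keep only the first item seen for each link."""
    seen_links = set()
    merged = []
    for feed in feeds:
        for item in feed:
            # Assumption: match the title case-insensitively.
            if blocked_title_text.lower() in item["title"].lower():
                continue
            # Duplicates are detected by the URL of the post.
            if item["link"] in seen_links:
                continue
            seen_links.add(item["link"])
            merged.append(item)
    return merged


# Hypothetical sample data standing in for two planet feeds.
planet_a = [
    {"title": "links for today (del.icio.us)", "link": "http://example.org/a/1"},
    {"title": "FRBR in practice", "link": "http://example.org/a/2"},
]
planet_b = [
    {"title": "FRBR in practice", "link": "http://example.org/a/2"},
    {"title": "SPARQL notes", "link": "http://example.org/b/1"},
]

result = smaller_planets([planet_a, planet_b])
for item in result:
    print(item["title"], item["link"])
```

With the sample data, the bookmark post is dropped and the post that appears on both planets comes through only once.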

Filtering out other content is easy, too. You can add rules to the Filter module to block, for example, posts from certain blogs. A rule that says “item.link contains xplus3.net” would block out anything I post to this blog. Feel free to clone Smaller Planets and customize it to create your personalized feed.
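A "field contains text" rule like the one above is easy to picture in code. The Python sketch below is only an illustration of the idea, not the Filter module itself: the function names, the (field, text) rule format, and the sample items are hypothetical, and it assumes an item is blocked when any one rule matches.

```python
def is_blocked(item, rules):
    """Return True if any (field, text) rule matches the item."""
    return any(text in item.get(field, "") for field, text in rules)


def apply_block_rules(items, rules):
    """Keep only the items that no rule blocks."""
    return [item for item in items if not is_blocked(item, rules)]


# Each rule reads as "<field> contains <text>".
rules = [
    ("link", "xplus3.net"),
    ("title", "del.icio.us"),
]

# Hypothetical sample items.
items = [
    {"title": "A post from this blog", "link": "http://xplus3.net/some-post"},
    {"title": "links for today: del.icio.us", "link": "http://example.org/links"},
    {"title": "MARC and RDF", "link": "http://example.org/marc-rdf"},
]

kept = apply_block_rules(items, rules)
print([item["title"] for item in kept])
```

Adding another blog to the block list is then just one more entry in the rules list.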

I’ve only just scratched the surface, but Pipes looks to be a powerful application; I look forward to playing around with it more. If you create any interesting or useful pipes, please leave a comment here; I’d love to see them.

Code4Lib Journal Announces Its Call for Submissions

The Code4Lib Journal went live today with its first call for submissions. The Journal targets programmers, system administrators, and others who are developing the technology to move libraries forward. If this is you, consider submitting an article. If this is someone you know, be sure to let them know.

For more information, visit the Journal’s homepage, and go see Roy Tennant’s post about it, too.

Synonym Expansion in Google

This may be old news to many: Google can do synonym expansion.

For example, a search for cataloging digital resources gives you about 20 million web pages containing those three words. Add a tilde in front of a word, and Google will also search for synonyms of that word. So cataloging ~digital resources returns over 60 million pages that include those words or synonyms of the word “digital”, such as “computer” and “electronic”.

It’s a Blog!

This is a blog. My blog, in fact. It doesn’t have a name yet; I’ll probably do that on the eighth day.

“So, why should I have a blog?” I keep asking myself. Truth be told, I’m not altogether sure that I should. But the last couple of months have persuaded me that it might be time to try one. It’s all about community.

I’ve never really been a part of an on-line community before, at least not in any meaningful way. But recently I stumbled across code4lib, which can best be summarized as a community of library technologists that interacts via mailing lists, an IRC channel, conferences, and blogs. After lurking on the IRC channel for a bit, I slowly began to join in on the conversations. I found my head filling up every day with all that I learned from this community. I even find myself, now, involved in the creation of a new professional journal because of my (short) affiliation with this community.

So this blog is my next step, engaging a little bit more with this community, and perhaps others that I might join in the future. I intend it to be a semi-professional blog, so I’ll probably steer clear of inflammatory topics like religion and politics unless they clearly relate to my professional activities. The blog will emphasize libraries and technology, especially those areas where they overlap, including metadata, digital libraries, and computer programming.

I doubt I’ll post all that often. Really, anything over a post every couple of weeks would surprise me. In a couple of months, I’ll assess where I am; the whole project may be a mistake that should be put out of its misery. If I decide to keep going with it, I’ll ask one of the admins to add it to the planet, so a few people might see it.