Search Design Pattern on Flickr
Peter Morville (author of Ambient Findability, a really good read) has started a series of collections relating to search interface design patterns on Flickr. It's brilliant; check it out.
Faceted Search With Forage
Today I added support for faceted search to Forage. Faceted search is a way of drilling down into search results by filtering on particular fields or categories. The example below is relatively simple in that it only has one facet, category, but there is no reason why you can't have multiple different 'facets'. This is, in fact, done quite often and very successfully in product searches.
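To make the idea concrete without involving any search engine, here is a minimal sketch of faceting over plain PHP arrays. The function names (`countFacet`, `filterByFacet`) and the sample documents are made up for illustration and are not part of Forage; a real engine computes these counts inside the index rather than by looping over documents.

```php
<?php
// Toy illustration only: plain arrays stand in for indexed documents.
// None of these function names come from Forage.

/** Count how many documents carry each value of $field, most frequent first. */
function countFacet(array $documents, string $field): array
{
    $counts = array();
    foreach ($documents as $document) {
        if (!isset($document[$field])) {
            continue; // documents without the field don't contribute
        }
        $value = $document[$field];
        $counts[$value] = isset($counts[$value]) ? $counts[$value] + 1 : 1;
    }
    arsort($counts); // highest count first, keys preserved
    return $counts;
}

/** Drill down: keep only the documents whose $field equals $value. */
function filterByFacet(array $documents, string $field, string $value): array
{
    return array_values(array_filter(
        $documents,
        function ($document) use ($field, $value) {
            return isset($document[$field]) && $document[$field] === $value;
        }
    ));
}

// made-up sample data with two possible facets: category and region
$documents = array(
    array('title' => 'Story A', 'category' => 'Politics', 'region' => 'UK'),
    array('title' => 'Story B', 'category' => 'Sport',    'region' => 'UK'),
    array('title' => 'Story C', 'category' => 'Politics', 'region' => 'World'),
    array('title' => 'Story D', 'category' => 'Business', 'region' => 'UK'),
    array('title' => 'Story E', 'category' => 'Politics', 'region' => 'UK'),
    array('title' => 'Story F', 'category' => 'Sport',    'region' => 'World'),
);

// list each category with the number of documents it appears in
foreach (countFacet($documents, 'category') as $value => $count) {
    echo $value . " (" . $count . ")\n";
}

// drill down into one category, then facet the remaining documents by region
$politics = filterByFacet($documents, 'category', 'Politics');
print_r(countFacet($politics, 'region'));
```

The second half shows why multiple facets are useful: once you've filtered on one field, the counts for the other fields are recalculated over the narrowed result set.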
And on to the example
This example is an extension of the previous example, only this time we'll be using one of the BBC's news feeds. The reason we're going to use the BBC's news feeds this time is that they have an extra element in the feed, category. We're going to use this category as a facet field and then list the most popular categories in the feed.
<?php
require_once dirname(__FILE__) . '/../lib/Forage.php';
require_once 'Zend/Feed.php';
$feed = Zend_Feed::import('http://newsrss.bbc.co.uk/rss/newsonline_uk_edition/front_page/rss.xml');
// we're now using Solr as the engine, more about this later...
$forage = Forage::create('solr:127.0.0.1:8080');
foreach ($feed as $item) {
    $document = new ForageDocument();
    $document->add('title', (string)$item->title())
        ->add('link', (string)$item->link(), array('indexed'=>false))
        ->add('description', (string)$item->content(), array('stored'=>false))
        // we mark the facet field as such at index time
        ->add('category', (string)$item->category(), array('facet'=>true));
    $forage->add($document);
}
$forage->flush();
// create an empty query and tell it that we want to
// facet on 'category' before searching
$query = $forage->getQuery();
$query->setFacetFields(array('category'));
$response = $forage->search($query);
echo "Total Results: " . $response->getTotal() . "\n";
// get the category facet from the response
$facet = $response->getFacet('category');
$i = 0;
// loop over the category values showing the top three
// along with the number of documents each appears in.
foreach ($facet->values as $value) {
    // stop once three values have been printed
    if ($i++ >= 3) {
        break;
    }
    echo $value->value . " (" . $value->count . ")\n";
}
At the moment faceting is only supported by the Solr engine, but Xapian will be adding faceting support in version 1.1. You can check the code out of Subversion, and more detailed documentation is in the wiki.
Introducing Forage – Search Abstraction for PHP
Recently I've been working on a search abstraction library for PHP called Forage. The idea is
to bring to search what we've had for relational databases for quite a while: abstraction. On Friday I put up a preview release with three
backends: Solr, Xapian and Zend Search Lucene. At the moment it has the bare minimum of features, but there will be more soon. In this post
I'm going to talk a little about the motivation for the project and then walk through a short example.
So why do we need search abstraction?
The reasons for wanting an abstraction library for search are pretty much the same as for databases: ease of integration and resilience to change.
Ease of integration
If you have one interface which provides access to multiple backends then a framework (or other application) can use this interface and then allow
the user to choose which backend to use depending on their needs and abilities. It also allows the users of the framework to scale their solutions
as they grow, though that is really the second point.
Resilience to change
If you have one interface which provides access to multiple backends then once you've implemented your solution you can change the backend if you
need to. With relational databases this is rarely done but with search, certainly in PHP at the moment, there is a bit more of a need for it. Let's
say you have a small site which does something cool. You need a search solution up and running very quickly without rocking the boat too much so
you use Zend Search Lucene (ZSL) and it works very well. However, your site starts to get more popular (as sites which do cool things do) and it starts to creak, so
you decide you need to scale up to a more capable solution such as Solr. If you're not using an abstraction layer, at this point you have to
re-implement your search module. With Forage you just need to set up your Solr server and change the DSN from 'zsl:/path/to/index' to 'solr:host:port/path'
and re-index. Job done!
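As a rough picture of why changing the DSN can be enough, here is a hypothetical sketch of a DSN-driven factory. The interface, the class names and `createBackend()` are invented for this illustration and are not Forage's actual internals; the only things taken from the text above are the 'zsl:' and 'solr:' DSN prefixes.

```php
<?php
// Hypothetical sketch: how a DSN-driven factory might choose a backend.
// All names below are invented for illustration, not taken from Forage.

interface SearchBackend
{
    public function describe(): string;
}

class ZslBackend implements SearchBackend
{
    private $path;

    public function __construct(string $path)
    {
        $this->path = $path;
    }

    public function describe(): string
    {
        return "Zend Search Lucene index at " . $this->path;
    }
}

class SolrBackend implements SearchBackend
{
    private $location;

    public function __construct(string $location)
    {
        $this->location = $location;
    }

    public function describe(): string
    {
        return "Solr server at " . $this->location;
    }
}

function createBackend(string $dsn): SearchBackend
{
    // split on the first colon only, so 'solr:host:port/path' keeps its port
    $parts = explode(':', $dsn, 2);
    if (count($parts) !== 2 || $parts[1] === '') {
        throw new InvalidArgumentException("Malformed DSN: " . $dsn);
    }
    list($scheme, $rest) = $parts;

    switch ($scheme) {
        case 'zsl':
            return new ZslBackend($rest);
        case 'solr':
            return new SolrBackend($rest);
        default:
            throw new InvalidArgumentException("Unknown backend: " . $scheme);
    }
}

// the calling code is identical for both backends; only the DSN differs
echo createBackend('zsl:/path/to/index')->describe() . "\n";
echo createBackend('solr:127.0.0.1:8080')->describe() . "\n";
```

Because the application only ever talks to the shared interface, the backend choice lives in one configuration string rather than being spread through the search module.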
Enough talk, let's play!
To show you how easy it is to implement search with Forage, let's run through a little example. For this example I'm going to index some data
out of an RSS feed. I'll be using Zend_Feed from the Zend Framework and for the backend
to Forage I'm going to use Xapian. I'm just going to index all the items and then run a search over the index.
<?php
require_once 'Zend/Feed.php';
require_once 'Forage/Forage.php';
// import the feed
$feed = Zend_Feed::import('http://rss.slashdot.org/Slashdot/slashdot');
// initialise forage
$forage = Forage::create('xapian:/var/xapian/slashdot');
// iterate over the feed items
foreach ($feed as $item) {
    // create a new document
    $document = new ForageDocument();
    // add some fields to it
    $document->add('title', (string)$item->title()) // will be both indexed and stored
        // won't be indexed but will be stored
        ->add('link', (string)$item->link(), array('indexed'=>false))
        // will be indexed but won't be stored
        ->add('description', (string)$item->content(), array('stored'=>false));
    // add the document to the index
    $forage->add($document);
}
// flush the changes to the index
$forage->flush();
// search over the index
$results = $forage->search('yahoo microsoft');
foreach ($results as $document) {
    echo $document['title'] . "\n";
}
That's not bad, is it? A feed indexing program in under 70 lines of code. If you're interested then get over to the
Forage download page and give it a whirl, and if you can, get involved.
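If you're wondering what the 'indexed' and 'stored' options in the example amount to, the toy class below may help. It is only a sketch under my own assumptions (a naive in-memory inverted index, with both options defaulting to true); `ToyIndex` and the sample document are made up, and this is not how Forage or any of its backends work internally.

```php
<?php
// Toy model of the 'indexed' and 'stored' field options.
// This is NOT how Forage or its backends are implemented.

class ToyIndex
{
    private $postings = array(); // term => list of document ids
    private $stored = array();   // document id => stored field values

    public function add(array $fields): int
    {
        $id = count($this->stored);
        $this->stored[$id] = array();
        foreach ($fields as $name => $field) {
            // assumption: both options default to true
            $indexed = isset($field['indexed']) ? $field['indexed'] : true;
            $stored = isset($field['stored']) ? $field['stored'] : true;
            if ($indexed) {
                // searchable: split the value into lowercase terms
                $terms = preg_split('/\W+/', strtolower($field['value']), -1, PREG_SPLIT_NO_EMPTY);
                foreach (array_unique($terms) as $term) {
                    $this->postings[$term][] = $id;
                }
            }
            if ($stored) {
                // retrievable: keep the original value for display
                $this->stored[$id][$name] = $field['value'];
            }
        }
        return $id;
    }

    public function search(string $term): array
    {
        $term = strtolower($term);
        $ids = isset($this->postings[$term]) ? $this->postings[$term] : array();
        $results = array();
        foreach (array_unique($ids) as $id) {
            $results[] = $this->stored[$id];
        }
        return $results;
    }
}

// made-up sample document
$index = new ToyIndex();
$index->add(array(
    'title'       => array('value' => 'Search abstraction for PHP'),
    'link'        => array('value' => 'http://example.com/story', 'indexed' => false),
    'description' => array('value' => 'A short walkthrough of indexing a feed.', 'stored' => false),
));

// 'walkthrough' only occurs in the description, so the document matches,
// but the description itself is not returned because it wasn't stored
print_r($index->search('walkthrough'));

// the link was stored but not indexed, so searching on it finds nothing
print_r($index->search('example'));
```

In short, 'indexed' decides whether a field can match a query, and 'stored' decides whether its value comes back with the result.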










