The implications of disaggregating your content
by Tom Beyer, Director of Publishing at iFactory
One of the consequences of putting content online is that it allows publishers to explode the notion of the book as a container. In many ways this seems like a great thing: you can combine content in different ways and even let your users pick and choose the content they are interested in. But this flexibility comes with strings attached. You need the right workflows and systems in place to handle the new level of complexity in metadata and user transaction data. Nothing comes without a cost, and in this case the cost is the added complexity of managing your content more closely.
The first thing you must ask yourself is whether your content lends itself to being chopped up at a finer level of detail than the original book. If the content is highly narrative, with a strong story line or argument, then it is probably not appropriate for any sort of chunking. Fiction and certain kinds of monographs are clear examples: no one wants just a few chapters from the middle of Bleak House or Infinite Jest.
But assuming that some sort of disaggregation does make sense for your content, the following describes just a few of the issues that need to be considered as your content moves online and you consider how you want to present it to your users.
Metadata
The most important thing to consider ahead of time is the metadata that needs to be created to allow your content to be disaggregated. If content is being combined into different collections, then there needs to be some mechanism for associating each item of content with the appropriate collection(s). These collections can be based on subject, time, theme – anything that makes sense for your list – the choice is up to you. But there needs to be a mechanism for tagging the data with the appropriate information and communicating that information to your online platform.
Taxonomies are increasingly used to help users find the content they are looking for. Often, publishers already have some of this information at the book level. To make the most of moving the content online, this information really needs to be created at the chunk or chapter level. Often this means rethinking the taxonomy: does it need to be more detailed? Is there an industry standard taxonomy that you can use? If so, does your content map to it, or are there big holes where you don’t have relevant content?
One way to think of this is that all of the metadata you currently create and maintain at the book level needs to be maintained at the chunk (often chapter or entry) level. Depending on the kind of books you are entering into the system, this may be an order-of-magnitude increase in the metadata that you manage. Are your systems and business processes equipped to handle this level of data?
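As a rough illustration of what chunk-level metadata might look like, here is a minimal Python sketch. The field names (chunk_id, book_isbn, subjects, collections) and the sample records are hypothetical and not taken from any particular platform; the point is only that each chunk carries its own tags and collection assignments, from which collection listings can be derived.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Chunk:
    """One sellable unit of content (a chapter or entry). All fields are illustrative."""
    chunk_id: str
    book_isbn: str          # the book this chunk was cut from
    title: str
    subjects: List[str] = field(default_factory=list)     # taxonomy terms
    collections: List[str] = field(default_factory=list)  # collections it belongs to


def index_by_collection(chunks: List[Chunk]) -> Dict[str, List[str]]:
    """Map each collection name to the ids of the chunks assigned to it."""
    index = defaultdict(list)
    for chunk in chunks:
        for name in chunk.collections:
            index[name].append(chunk.chunk_id)
    return dict(index)


# Hypothetical sample data: two chapters from one book, one from another.
sample = [
    Chunk("b1-ch1", "isbn-1", "Origins", ["history"], ["History", "Europe"]),
    Chunk("b1-ch2", "isbn-1", "Trade", ["economics"], ["Europe"]),
    Chunk("b2-ch1", "isbn-2", "Empire", ["history"], ["History"]),
]
collection_index = index_by_collection(sample)
```

Note that a single book now produces one record per chapter, which is where the growth in metadata volume comes from.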
Business Models
How do you intend to monetize the content? If the content is being packaged as part of a database that you plan to sell as a subscription service to librarians, be aware that the subscription model is receiving some increased resistance from libraries in the US. Also, subscription services demand a certain level of content updates to make them viable. Are you prepared to frequently update the system with new content?
One alternative to subscriptions is perpetual access. In this model libraries pay a higher initial fee to ‘own’ the content, usually followed by a modest annual fee to cover the hosting costs of the platform that provides it. If perpetual access is offered alongside the subscription model, there is usually a need to provide top-ups so that perpetual access customers can gain access to the new content that is added to the database over time.
A third model, PDA [Patron Driven Access], pioneered by EBL, is proving increasingly popular, and a number of the eBook aggregators now provide it. You should consider whether PDA is right for your content.
In all of these cases, you need internal systems that support selling content in these different ways. Are your current inventory and billing systems up to the challenge?
Discoverability
It is vitally important to consider discoverability and SEO in conjunction with your online content. There has been an increasing trend to provide some portion of the content outside the access-controlled firewall. Often quick search and the search results page are freely available, and sometimes a portion of the content itself is too. What happens when an unauthenticated user clicks on a result?
Ideally, the platform should at least show metadata about the content, and potentially an abstract that entices the user to purchase or subscribe. This means that the publisher needs to provide abstracts for each chunk or chapter of content. Sometimes this is simple because the content comes with an abstract (most journal articles, for instance), but if that’s not the case, the publisher must either invest the time to create them or develop an automated solution.
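To give a sense of what the simplest possible automated solution could look like, the Python sketch below builds a teaser from the opening sentences of a chunk. This is only an assumption about one naive approach (the function name and the 300-character default are made up for illustration); a real abstract would usually need editorial review or a more capable summarization method.

```python
import re


def naive_abstract(text: str, max_chars: int = 300) -> str:
    """Return the leading sentences of `text` that fit within `max_chars`.

    The first sentence is always kept, even if it is longer than the limit.
    """
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    abstract = ""
    for sentence in sentences:
        candidate = (abstract + " " + sentence).strip()
        if len(candidate) > max_chars and abstract:
            break
        abstract = candidate
    return abstract
```

A teaser produced this way is only as good as the chunk's opening lines, so it works better for reference entries than for chapters that open with an anecdote.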
Custom Collections & Custom Publishing
Should you decide to allow your users to select the content for their own custom collections, or to allow for custom eBooks or custom POD books, you will need to provide even more metadata to the online system. Pricing and authorial information will need to be provided for each chunk that users can select. This is critical so that the resulting custom eBooks and POD books can be priced appropriately and your authors can be properly compensated. It also means you need reports that indicate exactly which chunks of content were used and in what quantities, so you can keep track of these figures for your own internal purposes. Your contracts with your authors then need to spell out how they are compensated when only a portion of their work is used in a custom publication.
Another important aspect to consider is the set of business rules surrounding how the content can be packaged. Is there a maximum amount of content that users can choose for a custom publication? A maximum amount from any single publication? A maximum total number of publications that can be picked from? Are there certain combinations that shouldn’t be allowed for whatever reason? If you decide to allow users to upload their own content, is there a maximum percentage that they can put into a custom publication? Also, do you have any restrictions on where you can sell your content? Can you sell worldwide? If not, do you need to control that in your online system? What about currency options? Tax and shipping costs for POD publications may also need to be considered.
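The arithmetic behind chunk-level pricing and author compensation can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the PricedChunk fields, the idea that the book price is the sum of chunk prices, and the per-chunk royalty rate. Actual terms depend on your pricing policy and author contracts.

```python
from collections import defaultdict
from dataclasses import dataclass
from decimal import Decimal
from typing import Dict, List


@dataclass(frozen=True)
class PricedChunk:
    """Hypothetical chunk record carrying pricing and author data."""
    chunk_id: str
    author: str
    price: Decimal         # price charged for including this chunk
    royalty_rate: Decimal  # share of that price owed to the author


def price_custom_book(chunks: List[PricedChunk]) -> Decimal:
    """Assumed pricing rule: a custom book costs the sum of its chunk prices."""
    return sum((c.price for c in chunks), Decimal("0"))


def royalties_per_author(chunks: List[PricedChunk], copies_sold: int) -> Dict[str, Decimal]:
    """Total royalty owed to each author for `copies_sold` copies of the book."""
    owed = defaultdict(Decimal)
    for c in chunks:
        owed[c.author] += c.price * c.royalty_rate * copies_sold
    return dict(owed)


# Hypothetical selection: two chunks by Author A, one by Author B.
selection = [
    PricedChunk("ch-1", "Author A", Decimal("4.00"), Decimal("0.10")),
    PricedChunk("ch-2", "Author B", Decimal("6.00"), Decimal("0.10")),
    PricedChunk("ch-3", "Author A", Decimal("2.50"), Decimal("0.20")),
]
book_price = price_custom_book(selection)
royalties = royalties_per_author(selection, copies_sold=10)
```

Decimal is used instead of float so that currency amounts do not pick up binary rounding errors. The same per-chunk usage records that feed this calculation are the ones your reports need to capture.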
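Packaging rules like these tend to end up as explicit checks in the online system. The Python sketch below shows one way they might be expressed; the rule names, the default limits, and the representation of a selection as (publication_id, chunk_id) pairs are all hypothetical, and territory, currency, tax, and shipping rules are left out for brevity.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class PackagingRules:
    """Illustrative limits; the default values are placeholders, not recommendations."""
    max_chunks: int = 20                  # total chunks in one custom publication
    max_chunks_per_publication: int = 5   # chunks taken from any single source
    max_publications: int = 8             # distinct sources drawn from
    max_uploaded_share: float = 0.25      # fraction that may be user-uploaded


def check_selection(selection: List[Tuple[str, str]],
                    uploaded_chunks: int,
                    rules: PackagingRules) -> List[str]:
    """Return a list of rule violations; an empty list means the selection is allowed."""
    problems = []
    total = len(selection) + uploaded_chunks
    if total > rules.max_chunks:
        problems.append(f"too much content: {total} chunks (limit {rules.max_chunks})")
    per_publication = Counter(pub for pub, _ in selection)
    for pub, count in sorted(per_publication.items()):
        if count > rules.max_chunks_per_publication:
            problems.append(f"too many chunks from {pub}: {count}")
    if len(per_publication) > rules.max_publications:
        problems.append(f"too many source publications: {len(per_publication)}")
    if total and uploaded_chunks / total > rules.max_uploaded_share:
        problems.append("uploaded content exceeds the allowed share")
    return problems
```

Returning every violation at once, rather than stopping at the first, lets the interface tell the user everything that needs to change in a single pass.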
PubFactory
Disaggregation and custom publishing are major components of our PubFactory online platform. We provide the tools to allow publishers to disaggregate their content, but ultimately our publishers are on the hook to provide us with the necessary data to make the system work. Of course, we are happy to provide advice to publishers considering what content to move online and how to do it. Give us a call!
Recently, we announced another collaboration with Oxford University Press, this time on the design of the newly launched Oxford Bibliographies Online. Chris Reidy of the Boston Globe ran a news update that appeared online, and in the Globe’s digital newsletter.
It’s not often that we take a moment to pat ourselves on the back or post press releases to our blog. However, we are so thrilled about our latest launch, Oxford Dictionaries Online, that we wanted to share the news with you, our loyal blog readers. If you haven’t already spotted the release across the networks, we have included a few bits and pieces, and a link to the entire release, below.
Oxford Dictionaries Online is Oxford’s innovative modern English dictionary and language reference service. Featuring smart-linked, fully searchable content from Oxford’s largest modern English dictionaries and thesauruses, ODO provides comprehensive coverage of British, US, and World English with more than 350,000 definitions and 600,000 synonyms and antonyms.
PubFactory’s customizable features include a ‘My Oxford Dictionary’ feature for creating your own profile and saving entries and searches, an innovative alpha word wheel for browsing, and a user-friendly advanced search.
“We are delighted to have worked with iFactory in developing Oxford Dictionaries Online,” said Judy Pearsall, Head of Dictionaries, Oxford University Press. “This was a large custom-built project with complex data needs, and iFactory excelled in building an enticing and intuitive interface with underlying data sophistication and robust technology. Overall we’re pleased that it’s been such a good match of design, data skills, and rich content.”
The selection of PubFactory for Oxford Dictionaries Online continues a long-standing relationship between iFactory and OUP.
At a small tech gathering recently in Boston, font guru Paul Irish illustrated how the web is finally ready for a richer web fonts experience. For some time now, designers and developers have been stuck with only a handful of default “web friendly” fonts: Helvetica, Arial, Times, Courier, Georgia, among others. While a variety of techniques have cropped up over the years to satisfy the need for other fonts (including sIFR, typeface.js, Cufón, and even text-to-image replacers), not all browsers incorporated native techniques for embedding unique fonts. This has been a source of frustration for designers trying to break out of the mold and do more sophisticated and exciting work, as the alternative tools have all had implementation hazards and limitations of one kind or another.
This is all changing, as technology is crossing the threshold toward a brighter web fonts horizon. This isn’t to say the way ahead is void of other challenges, particularly as concerns licensing. Rather, many of the key pieces are set to do the most basic of things: embed fonts natively into the browser with stunning visual results. So how is this possible (and why was it not really feasible earlier)?
The first place to look is cross-browser support. Atypically, Internet Explorer is actually years ahead in the charge: it has allowed font embedding since IE4, whereas most of the other browsers have taken a while to catch up. (Though Opera actually introduced the spec.) According to figures from StatCounter.com’s global stats website, 95% of the major browsers have native font embedding capabilities, with the missing 5% being Firefox 3.0. (All the others support it: IE6+, FF3.5+, Safari 3+, Chrome, and Opera.)
Whether you arrived here by way of our new website or by other means, we encourage you to explore our new site!


If you’ve ever visited a seemingly innocuous site only to find it filled with links to spam, gambling sites, and other inappropriate content, then you may be witnessing a SQL Injection attack. Over the past year, iFactory has dealt with SQL Injection attacks on several completely unrelated sites with technology ranging from ASP to ColdFusion.
In this post, we discuss how to identify SQL Injection attacks from web log files, and provide a solution for preventing attacks from generating spam on a ColdFusion site.
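As a rough idea of what spotting injection attempts in web logs can involve, the Python sketch below flags request lines whose URL-decoded form contains common SQL fragments. It is not the method from the post: the patterns are hypothetical examples, the log format is assumed to be one request per line, and a keyword match can easily produce false positives, so hits are candidates for review rather than proof of an attack.

```python
import re
from typing import Iterable, List
from urllib.parse import unquote_plus

# Hypothetical, deliberately small set of patterns seen in injection attempts.
SUSPICIOUS = re.compile(
    r"(\bunion\b.+\bselect\b"          # UNION ... SELECT
    r"|\bdeclare\b.+@"                 # DECLARE @variable
    r"|\bexec\s*\("                    # EXEC(
    r"|;\s*drop\b"                     # ; DROP
    r"|'\s*or\s+'?1'?\s*=\s*'?1)",     # ' OR '1'='1
    re.IGNORECASE,
)


def suspicious_requests(log_lines: Iterable[str]) -> List[str]:
    """Return the log lines that match a suspicious pattern after URL-decoding."""
    hits = []
    for line in log_lines:
        # Attack strings are usually URL-encoded in logs, so decode before matching.
        if SUSPICIOUS.search(unquote_plus(line)):
            hits.append(line)
    return hits
```

Detection only tells you that someone is probing the site. Independently of any log scanning, the standard defence is to pass user input to the database through parameterized queries rather than concatenating it into SQL strings.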