Monday, July 26, 2010


The geek in me gets a hold of my brain sometimes, and when it does, that can be as fun as writing.

Here's what I've been thinking about recently: Publishing is a stochastic process.

Okay, I'll back up.

In physics, when you've got a large system made up of many small bodies, and you can predict what the system as a whole will do but not what any one individual body will do, that's (loosely speaking) a stochastic process.

An example of this is a gas made up of molecules. Imagine you compress the gas into a smaller volume. What will one individual molecule do? You can't say. It might collide with another molecule. It might move up or down or to the left or it might stay still. It might rotate, for heaven's sake.

But the gas--you can say things about the gas as a whole. You can say that its temperature and pressure will increase.

Publishing is a stochastic process. You can't say with any accuracy what one random person wandering through a bookstore will or won't buy. You might, however, be able to predict roughly how many copies of one particular book will sell to the whole population of random people wandering through bookstores.
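To make that concrete, here's a toy simulation. It's only a sketch: the number of browsers, the 2% chance of buying, and the assumption that every shopper decides independently are all made up for illustration. Nobody can say which shoppers will buy, yet the total barely moves from run to run.

```python
import random

def simulate_sales(num_browsers, buy_probability, rng):
    """Count how many of num_browsers independent shoppers buy the book."""
    return sum(1 for _ in range(num_browsers) if rng.random() < buy_probability)

rng = random.Random(42)   # fixed seed so the run is repeatable
num_browsers = 100_000    # assumed number of people who see the book
buy_probability = 0.02    # assumed chance that any one of them buys it

# Repeat the whole "sales season" several times and compare the totals.
totals = [simulate_sales(num_browsers, buy_probability, rng) for _ in range(10)]
expected = num_browsers * buy_probability
spread = (max(totals) - min(totals)) / expected

print(f"expected sales: {expected:.0f}")
print(f"simulated range: {min(totals)} to {max(totals)}")
print(f"relative spread: {spread:.1%}")
```

Real shoppers aren't independent (that's what word-of-mouth is), so treat this as the simplest possible version of the idea.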

What makes publishing such a dangerous occupation is that publishers have to make these kinds of predictions all the time, and if they screw up--even on just one book a year--they stand to lose a lot of money. Too large a print run can devastate the company's bottom line; too small a run can let a potential bestseller slip into oblivion.
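That too-big-versus-too-small trade-off has a textbook form, usually called the newsvendor problem. The sketch below is not anything a publisher is known to use; it assumes demand follows a normal distribution, that unsold copies are worth nothing, and that the mean, spread, cost, and price shown are invented numbers. The function name `best_print_run` is hypothetical.

```python
from statistics import NormalDist

def best_print_run(mean_demand, sd_demand, unit_cost, unit_revenue):
    """Newsvendor print run under an assumed normal demand distribution."""
    underage = unit_revenue - unit_cost  # profit missed per copy of unmet demand
    overage = unit_cost                  # loss per unsold copy (assumes no salvage value)
    critical_ratio = underage / (underage + overage)
    # Print enough copies that demand is covered with probability critical_ratio.
    return NormalDist(mean_demand, sd_demand).inv_cdf(critical_ratio)

# Invented numbers: demand of 5000 give or take 1500, $3 to print, $10 earned per sale.
run = best_print_run(mean_demand=5000, sd_demand=1500, unit_cost=3.0, unit_revenue=10.0)
print(f"suggested print run: {run:.0f} copies")
```

The more profitable each sale is relative to the printing cost, the further above average demand the suggested run lands; a better demand forecast (a smaller `sd_demand`) shrinks that cushion.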

Now here's the thing: Physicists get hired by financial companies to mathematically model the stock market. These physicists can occasionally come up with better predictions of what's going to happen than the guesswork of savvy, experienced professionals can provide. And even a tiny edge can turn into massive profits when it comes to something as variable as the stock market.

So this makes me wonder if anyone's ever tried to mathematically model book sales, i.e. tried to predict the rate at which something will sell initially and how word-of-mouth will affect its sales. Anything that keeps those few, disastrous screw-ups from happening could make a huge difference to the publishing industry.
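One existing tool with roughly this shape is the Bass diffusion model from marketing, which splits sales into people who buy on their own initiative and people who buy because earlier buyers talked. Here is a minimal weekly version; the market size and the two coefficients are placeholder assumptions, not measured values for any real book.

```python
def bass_weekly_sales(market_size, p, q, weeks):
    """Discrete-time Bass diffusion: returns a list of new sales per week.

    p: chance per week that a not-yet-buyer buys unprompted (cover, blurb, placement)
    q: strength of word-of-mouth from people who already bought
    """
    cumulative = 0.0
    weekly = []
    for _ in range(weeks):
        remaining = market_size - cumulative
        adoption_rate = p + q * cumulative / market_size
        new_sales = adoption_rate * remaining
        weekly.append(new_sales)
        cumulative += new_sales
    return weekly

# Placeholder inputs: 10,000 potential readers, weak initial pull, strong word-of-mouth.
sales = bass_weekly_sales(market_size=10_000, p=0.01, q=0.3, weeks=52)
peak_week = sales.index(max(sales)) + 1

print(f"first-week sales: {sales[0]:.0f}")
print(f"peak week: {peak_week}")
print(f"total after a year: {sum(sales):.0f}")
```

With `q` much larger than `p`, sales start slowly, climb as readers talk, and then fall off as the pool of potential buyers runs dry, which is the "sleeper hit" curve. Setting `q` to zero gives a book that sells best in week one and fades from there.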

I've thought about the problem a bit. It could be done, but you'd need some input parameters that you could only get by quizzing readers who are the sort of people who'd potentially buy that kind of book (about 30 of them, the usual rule-of-thumb minimum for a usable sample).

You'd need to ask them how well the cover, blurb, and sample page draw them in (i.e. convince them to buy the book), then quiz them again after they've read the book regarding whether they liked/hated it enough to mention that fact to a friend or two.
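Turning those quiz answers into a model input is a matter of estimating a proportion, and it's worth seeing how much uncertainty a small panel leaves. The sketch below uses the Wilson score interval, one standard choice; the survey result (12 of 30 readers saying they'd recommend the book) is invented for illustration.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion (z=1.96)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width

# Invented survey: 12 of 30 readers say they'd mention the book to a friend.
low, high = wilson_interval(12, 30)
print(f"observed rate: {12 / 30:.0%}")
print(f"plausible range: {low:.0%} to {high:.0%}")
```

With 30 readers the plausible range spans roughly 25% to 58%, so the word-of-mouth input would be known only loosely; a larger panel narrows it.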

And that's the tricky part, because while publishers would benefit most from having access to the modelled data, an internet-based book seller (particularly one with a mighty database like Amazon has) would be better equipped to do the initial study. They could offer incentives to readers in order to get feedback on a new book.

Me and my rusty memory of statistical physics are still working on the model, but just think how freeing it would be to the publishing industry if they could get a system in place that helped them avoid those few, but appallingly costly, missteps that plague their profit margins.

Also consider how it might help quirky authors find their market; if a publisher could predict how many copies of a particularly oddball novel it can sell (via booksellers), then it could adjust its print runs to make a profit even on less commercial books.


What do you think? Have you heard of this being done already? (If there's money at stake, surely someone's taken a stab at it...) Do you think it's possible to pin down something as variable and unpredictable as individual taste and the zeitgeist of the public?

What input parameters do you think such a model would need? I've thought about author brand, enticement of title, pretty covers, prominence of bookstore placement, word-of-mouth, etc. etc...

Author website: J. J. DeBenedictis
