Shahar Solomianik


Crowd Sourcing Sucked, So I Put Instagram API to Work Instead


I’m a hacker. And a surfer (surprisingly, not such a rare combination). About a year ago I decided I’d had enough of the poor quality of real-time surf reports. Advanced surf forecasts are available, and surf cams are installed on various beaches. Yet without a precise view of the surf at any given moment, I too often found myself wasting precious time (otherwise spent hacking) driving to the beach only to find poor surfing conditions. Or, on the other hand, skipping the drive just to be sorry later when I got the “dude, it was pumpin so awesome, u should have come!” line from someone who did surf.

So I developed [the 1st version of] SwellPhone, a service for “producing and consuming real-time surf reports”, in my declarative way of describing it. I reckoned that if I gave the surfing community a tool to share photos and videos of the surf that they take just before or after surfing, I would revolutionize real-time surf reporting. Hell, why not? You have a smartphone, right? Why not point it at the surf, take a video or photo and help other surfers judge the surfing conditions?

It was a noble idea. It even got to the front page of Hacker News. But I was so naive… Everybody wanted to “consume” the surf reports alright. Just about everybody. However, no one was willing to “produce” a surf report. Absolutely no one. And without anyone producing, there was nothing to actually consume.

I was trying to crowd source surf reporting. But the crowd simply didn’t want to source. Crowd sourcing sucked.

I let SwellPhone linger and diverted my energy elsewhere.

Until one day, while browsing Instagram, I stumbled upon this pic of a cute little girl in the Maldives, and it made me laugh, so I Gimped it to express my thoughts:



Then I took another look at the pic and it hit me: any surfer within driving distance of this awesome tube who got to see this photo soon after it was taken would simply drop everything and rush to surf there. Dude, this is not a pic of a cute girl. This is a surf report!

Wait, it’s taken from Instagram, do they have an API? Yes, they do!

Wait, do they geo tag the pics? Yes, they do!

Wait, do they timestamp the pics? Of course they do!

Wait, is there sufficient inventory of pics around known surf spots at any given time? Well, sometimes there is!

Hooray! SwellPhone can rise again! This time though, without relying on the crowd to source pics to SwellPhone. People already take thousands of pics of their kids, tanned legs and bellies, gorgeous looking girlfriends/boyfriends, and let’s not forget the empty beer bottles stuck in the sand. All of them have some view of the surf in the background. For us surfers, this is the foreground…
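To make the idea concrete, here is a minimal sketch of the filtering step: given photos that are already geotagged and timestamped, keep the ones taken close to a surf spot and recently enough to count as a surf report. This is not SwellPhone's actual code and it does not call the Instagram API. The `Photo` shape, the 1 km radius and the 2 hour freshness window are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Photo:
    """A geotagged, timestamped photo (hypothetical shape, not Instagram's schema)."""
    url: str
    lat: float
    lon: float
    taken_at: datetime


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def fresh_photos_near(photos, spot_lat, spot_lon, now,
                      radius_km=1.0, max_age=timedelta(hours=2)):
    """Keep photos within radius_km of the spot and no older than max_age, newest first."""
    hits = [
        p for p in photos
        if haversine_km(p.lat, p.lon, spot_lat, spot_lon) <= radius_km
        and timedelta(0) <= now - p.taken_at <= max_age
    ]
    return sorted(hits, key=lambda p: p.taken_at, reverse=True)
```

Whether a photo that passes this filter actually shows the surf is a separate question; the filter only narrows the pile down to pics worth looking at.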

So there you have it. SwellPhone is on again. The site isn’t awesome looking, I know, I suck at design… But I developed PhoneGap apps for both iOS and Android as well and the UX is much better there.

All in all, it was a lesson to me, and I learnt something. Crowd sourcing can sometimes suck. But that doesn’t mean the data you need is not already sourced in some other manner.


Written by Isaac Trond

September 27, 2012 at 5:51 am

Zemanta: Crowd Sourcing Contextual CPA Advertising



In my last post, I wrote that CPA needs to be the next revolution in online advertising. I stressed that because CPA is the most efficient pricing method for advertisers, a wide adoption of CPA by publishers will drive the entire online advertising industry forward simply by helping it to gain market share over offline advertising. This is especially true in the current economic downturn, when advertising performance is becoming more and more significant.

I suggested that CPA is not widely adopted by publishers because there is no scalable system that deploys CPA ads on publishers’ sites in a way that generates high returns. A lot of innovation is required in order to make it scale.

However, I didn’t suggest any real innovative solution. I still don’t have a suggestion for a solution. But I think I just found a company that does. Let’s take a look:

Effective CPA Implementation

CPA can be used very effectively as of today, but it doesn’t scale for most publishers. Filling pre-allocated ad spaces with graphical or textual CPA ads, contextual as they may be, simply doesn’t do the trick. Successful CPA publishers will tell you that the highest eCPM is gained when they manually embed a minimal number of text links and banners that link to the single most relevant product, in the right spot within the content.
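For readers less familiar with the metric, eCPM is the effective revenue per thousand impressions. Under CPA the publisher is paid per completed action, so eCPM is total payout divided by impressions, times 1,000. A small sketch follows; the function name and the numbers are made up for illustration.

```python
def ecpm(conversions, payout_per_action, impressions):
    """Effective revenue per 1,000 impressions for a CPA placement."""
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    return conversions * payout_per_action * 1000 / impressions


# Made-up figures: 5 sales at an $8 payout each, over 10,000 page views.
example = ecpm(conversions=5, payout_per_action=8.0, impressions=10_000)
```

With these invented numbers the placement earns the equivalent of $4 per thousand impressions. A better-chosen product or link position raises conversions without adding impressions, which is why placement matters so much under CPA.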

Any solution for scaling CPA must answer these three questions with great accuracy, for every web page on which it is implemented:

  • What product is the most relevant to be sold on the page?
  • Where is the best location to link to the product within the content?
  • Which linking method is the best for the product?

What product is the most relevant to be sold on the page?

Things that are very obvious to the human mind are not always so obvious to a machine. While a machine may be able to suggest some products after extracting keywords from a given text, it has no way of determining which of those keywords is a relevant product to sell. A machine can’t understand tone, humor and cynicism from the text. It can hardly tell if a product is mentioned in a positive or negative manner. Furthermore, it’s even harder for the machine to name a single product with the best chance of being sold on the web page.

In order for a machine to do all these things, it will need to understand the context of the page. One can suggest that Google does this. After all, they run the best contextual advertising system out there. But even Google is limited. And the fact is, their system is not so contextual.

Google’s alleged contextual abilities are a derivative of its great search technology. Search technology is all about matching search queries to indexed pages and assigning a score to each match. When AdSense ads are embedded on a web page, the relevancy isn’t gained by understanding what the page is about, rather it is achieved through matching ads to pages as if the ads themselves were search queries. That’s about all that Google’s technology does.

In a sense, Google doesn’t try to find the most relevant ads for any web page. On the contrary – it finds the most relevant pages for any given ad.
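A toy sketch of what “ads as search queries” means in practice: the ad text is treated as a query and each page is scored by how often the ad's terms appear in it. This is emphatically not Google's algorithm, just a plain term-overlap score to show the limitation. All names and sample texts are invented.

```python
import re
from collections import Counter


def tokens(text):
    """Lowercase alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(ad_text, page_text):
    """Count how often the ad's distinct terms occur in the page."""
    page_counts = Counter(tokens(page_text))
    return sum(page_counts[t] for t in set(tokens(ad_text)))


def best_page_for_ad(ad_text, pages):
    """Treat the ad as a search query; return the name of the best-matching page."""
    return max(pages, key=lambda name: score(ad_text, pages[name]))


pages = {
    "review": "This surfboard wax is terrible, do not buy this wax",
    "news": "Local council meets on Tuesday",
}
winner = best_page_for_ad("surfboard wax", pages)
```

Here the ad lands on the "review" page, which pans the product. Term matching has no notion of tone, which is exactly the gap described above.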

If Google can’t deliver the most relevant product right away, then there is still work to do. It seems like we need a whole new technology in order to find the single most relevant product on a web page, something which is more contextual in its nature, and not based on search technology.

Where is the best location to link to the product within the content?

Let’s say we have a machine that understands what product to link to. Now, where should it embed the link? It’s a difficult decision for a machine, even harder than finding the relevant product itself.

It’s one thing to match ads to web pages (or web pages to ads) and fill pre-allocated ad spaces, but it’s another thing to actually allocate the ad space based on context. And we pretty much understand by now that even Google can’t do this.

This challenge doesn’t stop others from trying. In-text advertising is a somewhat new approach, implemented by companies like VibrantMedia, Kontera and even Amazon, all based on the assumption that a good link from within the content is worth more than a bunch of ads around it. Yet those companies also don’t have the technology to determine the best spot for the link to be embedded. All they do is turn extracted keywords into links. Those are not necessarily the right words to link, and not necessarily in the right spot. Again, we see that technology cannot yet deliver what an advertiser needs, in this case placement.
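A minimal sketch of keyword-to-link conversion, to show how mechanical it is. It is not how any of the named companies actually work; the function, the keyword table and the example.com URL are placeholders invented for illustration.

```python
import re


def link_keywords(text, keyword_urls):
    """Wrap the first occurrence of each keyword in an HTML link (naive on purpose)."""
    if not keyword_urls:
        return text
    lookup = {k.lower(): url for k, url in keyword_urls.items()}
    # Longest keywords first so "surf wax" wins over "wax" at the same position.
    alternatives = "|".join(
        re.escape(k) for k in sorted(keyword_urls, key=len, reverse=True)
    )
    pattern = re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)
    seen = set()

    def replace(match):
        key = match.group(0).lower()
        if key in seen:
            return match.group(0)  # link each keyword only once
        seen.add(key)
        return f'<a href="{lookup[key]}">{match.group(0)}</a>'

    return pattern.sub(replace, text)


linked = link_keywords(
    "My camera broke. Never buy this camera.",
    {"camera": "https://example.com/camera"},
)
```

The first “camera” gets the link simply because it comes first, in a sentence complaining about the product. Nothing in the code asks whether that is the right word or the right spot.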

Which linking method is the best for the product?

Is this a product that requires a visual instead of a text link? And if so, which of the available visuals will perform better? Surely, no machine can provide the answer, yet…

Crowd Sourcing

It seems to me like machine-scalable CPA is a romantic idea that belongs to the future. The requirements are just too much for the current technology. Nevertheless, we need a solution now. So what can be done? Crowd sourcing.

If the technology doesn’t exist yet, there is no other way to achieve scalable CPA but to outsource the job to people. This has already been done in other areas of online activity, for example Digg, Delicious and uTest.

How can we use the crowd for contextual CPA? Using Mechanical Turk? Perhaps. However, I believe Zemanta offers a better and more organic way.

Zemanta Crowd Sources Contextual CPA Advertising

Since every page that contains content is generated by humans, what could be more natural than outsourcing the contextual CPA ad embedding to those humans – the authors themselves?

I can think of one problem: authors are not always commercially savvy. They are mostly concerned with writing and not with marketing.

What Zemanta does, however, is provide them with a tool to enrich their content with tags, images, links and more. Zemanta integrates smoothly into the author’s domain and provides great value. What if Zemanta offered some CPA links and banners in addition to its other offerings of pictures, links and tags? And what if those links and banners were seamlessly merged into Zemanta, thus providing a way for the author to integrate CPA in a truly organic fashion?

In an earlier post, I suggested that Zemanta would eventually look for revenue at the point of content consumption, even though they recently announced a paid API model. Andraz Tori, a co-founder of Zemanta, supports my assumption with his comment on CenterNetworks:

But we are definitely working on monetization.
For example when suggest link to Amazon and you specified your Affiliate ID, we insert it. If you haven’t specified it, we insert our own.

Presently, authors from around the world are using Zemanta and organically embedding CPA Amazon ads within their content. They embed ads to the right product, they embed them in the right spots within the content, and they decide whether or not to add a textual link or a product image. Best of all, the authors don’t do it with the intention of selling. They are doing it with the intention of increasing the value of their own content. As a result, ads are organically and naturally embedded within their content.
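The fallback rule in Andraz's comment (use the author's affiliate ID if there is one, otherwise the service's own) can be sketched in a few lines. This is a guess at the logic, not Zemanta's code. The `tag` query parameter and the `service-20` default ID are assumptions made for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse

DEFAULT_AFFILIATE_ID = "service-20"  # hypothetical placeholder for the service's own ID


def with_affiliate_id(url, author_affiliate_id=None, param="tag"):
    """Return url with an affiliate ID set: the author's if given, else the default."""
    affiliate_id = author_affiliate_id or DEFAULT_AFFILIATE_ID
    parts = urlparse(url)
    # Drop any existing affiliate parameter so the ID is never duplicated.
    query = [
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k != param
    ]
    query.append((param, affiliate_id))
    return urlunparse(parts._replace(query=urlencode(query)))


author_link = with_affiliate_id("https://www.amazon.com/dp/B000000000", "author-20")
fallback_link = with_affiliate_id("https://www.amazon.com/dp/B000000000?ref=x")
```

Either way the link carries someone's ID, so every suggested product link can earn a commission whether or not the author has thought about monetization.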

Of course this is just the tip of the iceberg. Those are just Amazon ads for now, and Zemanta’s method of revenue sharing is not yet solid. However, they have only recently started, and as far as I can see, Zemanta provides a great service, and its byproduct is crowd sourcing of contextual CPA advertising.

If that’s not CPA innovation, then what is?

Written by Isaac Trond

March 5, 2009 at 1:42 pm