I've been naughty (and busy). I skipped two project milestones: 

    2. Working prototype for output.
    3. Working front-ends for both input and output (searching for products and scraping recommendations).

The idea was that each user could have multiple research profiles that they trained differently and compared. 
In turn, each research profile could carry out multiple searches on Amazon to influence recommendations.

I was at milestone 3. I had a stable version of RARS running for the past month with thousands of recommendations stored in it for the benefit of those marking my MA final project.

The logic (partly) represented by this chart was working!

If possible, milestone 4 was going to be beta testing the application with hapless members of the public. After receiving my marks for my MA, I decided to tweak RARS a little. Everything promptly broke: the database couldn't talk to the web application, which in turn couldn't talk to the web server. I rather rashly decided to completely terminate the cloud server on which RARS was running, thus losing all the data I had gathered.

My Game of Cat-Duck and Mouse with Amazon

I had put a few lines in my snu-snu authentication source code to print the HTML of the web page at key junctures. This made it possible to determine whether the headless Chrome instance was reaching the right pages when attempting to access Amazon.
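A minimal sketch of that kind of checkpoint logging is below. It is not the actual snu-snu code: the `log_page` helper, the logger name and the checkpoint labels are made up for illustration. The only thing it assumes about the browser object is that it exposes `page_source` and `current_url`, as a Selenium WebDriver does.

```python
import logging

# Hypothetical logger name, chosen for this example only.
logger = logging.getLogger("snu_snu.debug")


def log_page(driver, checkpoint, max_chars=2000):
    """Log the HTML of the current page, labelled with a checkpoint name.

    `driver` can be any object with `page_source` and `current_url`
    attributes (a Selenium WebDriver has both). The HTML is returned
    so the caller can inspect it further.
    """
    html = driver.page_source
    # One short line per checkpoint, so the log shows where the browser got to.
    logger.info("[%s] url=%s length=%d", checkpoint, driver.current_url, len(html))
    # The (truncated) markup itself goes out at DEBUG level to keep logs readable.
    logger.debug("[%s] html:\n%s", checkpoint, html[:max_chars])
    return html
```

Calling something like `log_page(driver, "after-login-submit")` right after each navigation step leaves a trail showing which page the headless browser actually landed on.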

Logging the source code also led to this chuckle-inducing discovery

I spun up a new cloud server on Amazon Web Services and had to start from scratch. When I attempted to use RARS to log into the old dummy Amazon accounts I had used, I was met with this:

Take it from me: Combing through HTML source code is a fun Sunday Afternoon activity

This page was new. I had already encountered two others:

  1. One telling me that if I wanted to carry out automated interaction with Amazon, I could ask them and/or use their API.
  2. One asking for a captcha after an attempted log-in.

These two were fairly easy to deal with. With the first, I would just assign a new elastic IP to the cloud server. (Yes, that's right: on top of free cloud computing for a year, Amazon give you random IP addresses that their e-commerce platform doesn't seem to recognise as belonging to them.) With the second, I would either abandon the account or log into it over Tor and fill in the captcha, restoring Amazon's trust in the account.

This third roadblock was insurmountable. I tried rotating IP addresses. I tried generating a new email address. No joy. It was possible that my personal domain name had been blacklisted, since so many suspicious accounts had been created using email addresses on it. I considered trying to find a service that would let me create email accounts without a phone number over Tor. Yandex used to; a few others still might... At this point, I felt that trying to find another email provider would be about as effective as a broom in keeping the tide at bay. The only other option would be to fetch the relevant emails for the given account over IMAP, extract the code and get Chrome to type it into the page. Even if I wanted to, my day job wouldn't allow me to do this within a reasonable timeframe.
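For the curious, here is a rough sketch of what the IMAP half of that option might look like, using only Python's standard library. I never built this, so treat all of it as assumption: the six-digit code format, the function names, and the host, credentials and sender passed in are placeholders, not anything taken from Amazon's emails or from snu-snu.

```python
import email
import imaplib
import re

# Assumption: the verification code is a standalone run of six digits.
CODE_PATTERN = re.compile(r"\b(\d{6})\b")


def extract_code(raw_email):
    """Return the first six-digit code in the plain-text parts, or None."""
    msg = email.message_from_bytes(raw_email)
    for part in msg.walk():
        if part.get_content_type() != "text/plain":
            continue
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        text = payload.decode(charset, errors="replace")
        match = CODE_PATTERN.search(text)
        if match:
            return match.group(1)
    return None


def fetch_latest_code(host, user, password, sender):
    """Search the inbox for mail from `sender`, newest first, and extract a code.

    All four arguments are placeholders supplied by the caller.
    """
    with imaplib.IMAP4_SSL(host) as imap:
        imap.login(user, password)
        imap.select("INBOX")
        _status, data = imap.search(None, "FROM", f'"{sender}"')
        # Message ids come back oldest first, so walk them in reverse.
        for msg_id in reversed(data[0].split()):
            _status, parts = imap.fetch(msg_id, "(RFC822)")
            code = extract_code(parts[0][1])
            if code:
                return code
    return None
```

The returned string would then still have to be typed into the page by the browser automation, which is the part that depends on markup I never got to study.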

I fell at milestone 3.141592653589793...

Where from Here?

I knew at the beginning of this project that even if it was viable to create a wrapper around fake Amazon accounts for sociological research, it was never going to be scalable. Why did I do it? For one thing, I was doing an MA at Goldsmiths; the normal logics didn't apply. It was the right time to take risks and experiment.

There are a handful of things I can do with the few thousand lines of code that I wrote for this project, and the skills I learned along the way.

  • Adapt snu-snu into a browser plugin that spams Amazon with obfuscatory searches, rendering its recommendations meaningless.
  • Create a browser-automation-based library that provides a high-level interface for interacting with Facebook in ways not supported by their API. This could be used to automate your own Facebook behaviour or to make a bot. It would be interesting to teach neural nets optimal social networking behaviour.
  • Make more web applications that allow users to interact with the rest of the internet in complex novel ways.