These texts were originally published in Spectre of AI, the fifth issue of spheres: Journal for Digital Cultures.

The first text is a technical introduction to an open source recommender system, and the second is a broader, more philosophical reading of this software.

Part 1: Source Code

Introduction

This text aims to explain some of the source code of the open source recommender system LightFM. This piece of software was originally developed by Maciej Kula while working as a data scientist for the online fashion store Lyst, which aggregates millions of products from across the web. It was written with the aim of recommending fashion products from this vast catalogue to users with few or no prior interactions with Lyst.1 At the time of writing, LightFM is still under active development by Kula with minor contributions from 17 other developers over the past three years. The repository is moderately popular, having been starred by 2,032 GitHub users; 352 users have forked the repository, creating their own version of it that they can modify.2 Users have submitted 233 issues, such as error reports and feature requests, to LightFM over the course of its existence, which suggests a modest but active community of users.3 To put these numbers in perspective, the most popular machine learning framework on GitHub, TensorFlow, has been starred 113,569 times and forked 69,233 times with 14,306 issues submitted.4

While the theoretical text that accompanies this one addresses aspects of machine learning in LightFM, none of the source code quoted here actually does any machine learning. Instead, the examples here are chosen to demonstrate how an already trained model is processed so as to arrive at recommendations. The machine learning aspect of LightFM can be briefly explicated using Tom Mitchell’s much-cited definition: “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.”5 Task T, in this case, is recommending products to users that they are likely to buy. E is historical data on interactions between users and products as well as metadata about those users and products. P is the extent to which the model’s predictions match actual historical user-item interactions.

In practice, P is represented by a number produced by a loss function (also known as a cost function or objective function) that outputs higher numbers for poorly performing models whose predictions don’t match historical data. To arrive at an effective model, P needs to be minimised through some usually iterative process, which in the case of LightFM is a form of gradient descent.6 Gradient descent begins with a model with random, fairly arbitrary parameters and repeatedly tests the model, each time changing the parameters slightly with the aim of reducing the model loss (the number output by the loss function) and eventually reaching an optimal set of parameters.7 The parameters of LightFM’s model are embedding vectors for each feature or category that may be applied to a user or item; these are discussed in greater detail below. Returning to Mitchell’s definition: as gradient descent only optimises the model based on the historical data available, it is clear that up to a certain point, LightFM is likely to produce more relevant recommendations (T) if it is trained using a larger and presumably more representative set of test and training data (E).
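
To make this iterative process more concrete, the following is a minimal sketch of gradient descent on a single parameter, written in plain Python. It is not LightFM’s code: the squared-error loss, the target value and the learning rate are invented purely for illustration.

    # A toy model consisting of one parameter 'w'; the 'best' value of w is 3.0.
    target = 3.0
    learning_rate = 0.1

    def loss(w):
        # Squared-error loss: larger when the model performs poorly.
        return (w - target) ** 2

    def gradient(w):
        # Derivative of the loss: tells us which direction increases the loss.
        return 2 * (w - target)

    w = 0.0  # start from an arbitrary parameter value
    for step in range(100):
        # Nudge w slightly in the direction that reduces the loss.
        w = w - learning_rate * gradient(w)

    print(round(w, 3), loss(w))  # w is approximately 3.0 and the loss is close to zero

LightFM carries out the analogous procedure over many thousands of parameters at once, namely the elements of its feature embedding vectors and biases.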

The below excerpts of source code are taken from a file in the LightFM Git repository named _lightfm_fast.pyx.template. This file defines most of the actual number-crunching carried out by the LightFM recommender and is written in Cython, a special form of Python that works like a template for generating C code. The file contains 1,386 lines of Cython and is used to generate up to 30,720 lines of C. While verbose to us, this generated C code is compiled into even more prolix machine code which computers execute much faster than Python. Both examples are function definitions; they start with the word ‘cdef’, which in this case indicates that the definition of a function that can be compiled into C is to follow. Functions take input data in the form of one or more arguments and perform some computation using these data. They either modify the data that is passed into them or output some new data derived from this input. In the case of compute_representation, the word ‘void’ precedes the function name, indicating that the function returns nothing and would be expected to do something to its input. In the second example the word ‘flt’ (an alias for ‘float’) precedes compute_prediction_from_repr, meaning that it is expected to return a floating point number. For brevity’s sake, a floating point number is basically a decimal like 1.6.

Function 1: compute_representation8

This function is responsible for taking data about users (e.g. customers) and items (e.g. clothes or films) and producing latent representations of them. These representations can be used by the second function ‘compute_prediction_from_repr’ to predict the level of affinity between these users and items.

cdef inline void compute_representation(CSRMatrix features,
                                        flt[:, ::1] feature_embeddings,
                                        flt[::1] feature_biases,
                                        FastLightFM lightfm,
                                        int row_id,
                                        double scale,
                                        flt *representation) nogil:

The comma-separated lines in parentheses following the name of the function above are parameter declarations; these determine what data or arguments can be passed into the function. The first part of each parameter declaration can be thought of as the type of the expected argument and the second part is the name used to reference it in the body of the function. Here is an explanation of each of the parameters:


  1. features: an object belonging to the class ‘CSRMatrix’. Matrices are like tables, or row-column addressable grids, whose cells contain numbers. CSR matrices offer a way of storing matrices in which most of the numbers are zero; in other words: when the matrices are sparse. In this case – sticking with the table analogy – the rows correspond to users or items and the columns correspond to features of these users or items. If the items are films, each feature could be a genre. Cells in the table might contain 1 if the film (row) belonged to the genre (column) and 0 if it didn’t. Each row in the matrix can be taken as a vector representing the corresponding film.
  2. feature_embeddings: a two-dimensional array, which is also like a table with rows and columns. This array contains the embeddings for features that have been learned from training data using gradient descent, as discussed above. This array has a row for each feature and a column for each dimension of the features’ embeddings. These feature embedding rows can be thought of as vectors that contain information about how similar each feature is to others based on shared positive user interactions such as favourites and purchases. Vectors are like arrows with a direction and a magnitude or length. The row vectors of the ‘feature_embeddings’ array are represented by their Cartesian coordinates, such that if each row contained two numbers, the vectors would be two-dimensional and the first number might specify the horizontal position of the end of the vector (the tip of the arrow) and the second number the vertical position. If two features (e.g. ‘action’ and ‘adventure’ or ‘black’ and ‘dress’) are shared by items bought by the same 1,000 users, their embeddings will be similar; they will point in a similar direction. To illustrate: the similarity of two two-dimensional feature embeddings could be worked out by plotting them both on a sheet of paper as arrows originating from the same point and measuring the shortest angle between them using a protractor; the smaller this angle, the greater the affinity between the features (a short sketch illustrating this appears after this list).
  3. feature_biases: a one-dimensional array of floating point numbers. This is like a list of decimal numbers.
  4. lightfm: an object of the class FastLightFM. This object holds information about the state of the recommender model.
  5. row_id: an integer, or whole number, identifying the row in the features matrix that the function should compute a representation for.
  6. scale: a double-precision floating point number. It is called double-precision because it takes up 64 bits, or binary digits, of memory rather than the standard 32.
  7. representation: reference to a one-dimensional array of floating-point numbers. When executed, the ‘compute_representation’ function modifies the contents of the referenced ‘representation’ array.
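
To make the ‘features’ and ‘feature_embeddings’ parameters more tangible, here is a minimal sketch using NumPy and SciPy rather than LightFM’s own data structures. The genres, numbers and embedding values are invented for illustration only.

    import numpy as np
    from scipy.sparse import csr_matrix

    # A tiny items-by-features matrix: rows are films, columns are genres
    # (action, adventure, romance). A 1 means the film has that genre.
    # Most cells are 0, which is why a sparse CSR representation is used.
    features = csr_matrix(np.array([
        [1, 1, 0],   # film 0: action, adventure
        [0, 0, 1],   # film 1: romance
    ]))

    # Toy two-dimensional embeddings, one row per feature (genre).
    feature_embeddings = np.array([
        [0.9, 0.1],   # action
        [0.8, 0.2],   # adventure
        [0.1, 0.9],   # romance
    ])

    def cosine_similarity(a, b):
        # Cosine of the angle between two vectors: close to 1 when they
        # point in roughly the same direction.
        return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

    print(cosine_similarity(feature_embeddings[0], feature_embeddings[1]))  # high: similar genres
    print(cosine_similarity(feature_embeddings[1], feature_embeddings[2]))  # lower: dissimilar genres
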
"""
    Compute latent representation for row_id.
    The last element of the representation is the bias.
    """

    cdef int i, j, start_index, stop_index, feature
    cdef flt feature_weight

The two lines above declare variables, which allow values of a particular type to be referenced by names such as ‘start_index’. All of these variables are numbers that are used later in the function.

    start_index = features.get_row_start(row_id)
    stop_index = features.get_row_end(row_id)

Two of the variables declared above are being assigned values that are returned by the ‘get_row_start’ and ‘get_row_end’ functions. These functions are part of the ‘features’ CSRMatrix object and are called methods.

    for i in range(lightfm.no_components + 1):
        representation[i] = 0.0

The above two lines comprise a simple Python for-loop. The loop counts from zero up to, but not including, the value of ‘lightfm.no_components + 1’ in increments of one, each time setting the value of variable ‘i’ to the current count. For each increment it executes the indented code ‘representation[i] = 0.0’, which sets the ith element of the ‘representation’ array to zero. In effect, it sets every number in ‘representation’ to zero.


Note: ‘lightfm.no_components’ is an integer that determines the dimensionality of the features’ latent embeddings. If ‘no_components’ is ‘10’, each feature is represented by a ten-dimensional vector.

    for i in range(start_index, stop_index):
        feature = features.indices[i]
        feature_weight = features.data[i] * scale

This for-loop is slightly different: it counts from ‘start_index’ up to, but not including, ‘stop_index’, setting ‘i’ to the current count each time. It also executes five lines of indented code upon each iteration, including another nested for-loop. The two lines following the loop header set the ‘feature’ and ‘feature_weight’ variables to the appropriate values from the CSR matrix object. ‘feature’ is set to the index of the ith feature, which identifies that feature’s row in the ‘feature_embeddings’ array. ‘feature_weight’ is set to a value that indicates whether this particular feature belongs to the user or item that ‘compute_representation’ is computing a representation for; this is probably ‘1’ if the feature belongs and ‘0’ if it doesn’t.

        for j in range(lightfm.no_components):

            representation[j] += feature_weight * feature_embeddings[feature, j]

This nested for-loop counts from zero up to, but not including, the number of components (the dimensionality of the feature embedding) in increments of one and sets j to the current count. ‘feature_weight’ will maintain the same value for the duration of the loop: either ‘1’ or ‘0’. This means that either the entire feature embedding will be added to the ‘representation’ array or none of it.

All this loop is doing is adding together the latent representations of the features of a given item or user. As Maciej Kula puts it: “The representation for denim jacket is simply a sum of the representation of denim and the representation of jacket; the representation for a female user from the US is a sum of the representations of US and female users.”9
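
As a rough illustration of this summation, and assuming invented two-dimensional embeddings, made-up feature names and a feature weight of 1.0, the same idea can be sketched in plain Python with NumPy; this mirrors the loop above rather than reproducing LightFM’s Cython.

    import numpy as np

    # Hypothetical embeddings and biases for two features (values are illustrative).
    embeddings = {
        "denim":  np.array([0.7, 0.2]),
        "jacket": np.array([0.3, 0.5]),
    }
    biases = {"denim": 0.1, "jacket": 0.05}

    # The representation of a 'denim jacket' is the sum of its features' embeddings,
    # with the summed feature biases stored in one extra final slot.
    item_features = ["denim", "jacket"]
    no_components = 2
    representation = np.zeros(no_components + 1)  # the last element holds the bias
    for feature in item_features:
        feature_weight = 1.0                      # the feature is present
        representation[:no_components] += feature_weight * embeddings[feature]
        representation[no_components] += feature_weight * biases[feature]

    print(representation)  # roughly [1.0, 0.7, 0.15]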

        representation[lightfm.no_components] += feature_weight * feature_biases[feature]

If the user or item has the feature ‘features.indices[i]’, that feature’s bias, multiplied by ‘feature_weight’, is added to the final bias element of the ‘representation’ array.

Function 2: compute_prediction_from_repr10

This function is responsible for predicting how likely a user is to be interested in an item.

cdef inline flt compute_prediction_from_repr(flt *user_repr,
                                             flt *item_repr,
                                             int no_components) nogil:

There are only three parameters this time:

  1. user_repr: a reference to a one-dimensional array of floating point numbers. Its contents will have been produced by the ‘compute_representation’ function above based on the features and other data for a user.
  2. item_repr: a reference to a representation of the features of a given item, also produced by the ‘compute_representation’ function.
  3. no_components: the number of dimensions of the representation vectors.

    cdef int i
    cdef flt result

    # Biases
    result = user_repr[no_components] + item_repr[no_components]

The variable ‘result’ is set to the sum of the bias component of the latent representations for both the user and item.

    # Latent factor dot product
    for i in range(no_components):
        result += user_repr[i] * item_repr[i]

This loop adds the dot product or inner product of the user’s and item’s representation vectors to the ‘result’ variable. The dot product of two vectors is the sum of the results of multiplying each part of the first vector by the corresponding part of the second vector, as can be seen in the loop: ‘user_repr[i] * item_repr[i]’.

The dot product is a single number that varies with the difference in direction between the embedding vectors ‘user_repr’ and ‘item_repr’ and is greater if they are facing the same way. As these embeddings have been calculated based on known interactions between users and items, the dot product gives an indication of how likely a given user is to interact positively with a given item.
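
A small worked example, with invented two-dimensional representations, illustrates this: a vector pointing roughly the same way as the user’s produces a larger dot product than one pointing away from it.

    import numpy as np

    user_repr = np.array([0.9, 0.1])
    item_a = np.array([0.8, 0.2])   # points roughly the same way as the user vector
    item_b = np.array([0.1, 0.9])   # points away from the user vector

    # The dot product multiplies corresponding elements and sums the results,
    # just as the loop above does with user_repr[i] * item_repr[i].
    print(np.dot(user_repr, item_a))   # ~0.74: higher affinity
    print(np.dot(user_repr, item_b))   # ~0.18: lower affinity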

    return result

The function returns the dot product added to the sum of the item’s and user’s biases. These biases are calculated in the ‘compute_representation’ function and are the sums of the biases of the item’s and user’s active features. This effectively gives LightFM the ability to treat some features as more important than others.


1.   Cp. Maciej Kula, “Metadata Embeddings for User and Item Cold-start Recommendations”, paper presented in the 2nd Workshop on New Trends on Content-Based Recommender Systems co-located with 9th ACM Conference on Recommender Systems, 2015. Available at: http://ceur-ws.org/Vol-1448/paper4.pdf [accessed July 27, 2018].
2.   Cp. Maciej Kula, “GitHub – lyst/lightfm: A Python implementation of LightFM, a hybrid recommendation algorithm”, posted to Github. Available at: https://github.com/lyst/lightfm [accessed November 4, 2018].
3.   Cp. Maciej Kula, “Issues – lyst/lightfm – GitHub”, posted to Github. Available at: https://github.com/lyst/lightfm/issues [accessed November 4, 2018].
4.   Cp. TensorFlow, “GitHub – tensorflow/tensorflow: An Open Source Machine Learning Framework for Everyone”, posted to Github. Available at: https://github.com/tensorflow/tensorflow [accessed November 4, 2018].
5.   Tom Mitchell, Machine Learning, London, McGraw-Hill, 1997, p. 2.
6.   Cp. Kula, “Metadata Embeddings for User and Item Cold-start Recommendations”.
7.   Cp. Andrew Ng, “Lecture 2.5 — Linear Regression With One Variable | Gradient Descent — [ Andrew Ng]”, lecture posted to Youtube. Available at: https://www.youtube.com/watch?v=F6GSRDoB-Cg [accessed November 4, 2018].
8.   Maciej Kula, “lightfm/_lightfm_fast.pyx.template”, posted to Github, line 287. Available at: https://github.com/lyst/lightfm/blob/e12cfc7e5fa09c1694b98acc96af3ed2754646ae/lightfm/_lightfm_fast.pyx.template#L287 [accessed July 25, 2018].
9.   Cp. Kula, “Metadata Embeddings for User and Item Cold-start Recommendations”.
10.   Maciej Kula, “lightfm/_lightfm_fast.pyx.template”, posted to Github, line 320. Available at: https://github.com/lyst/lightfm/blob/e12cfc7e5fa09c1694b98acc96af3ed2754646ae/lightfm/_lightfm_fast.pyx.template#L320 [accessed July 25, 2018].


Part 2: Machine Learning and the Machinic Unconscious

Introduction

In this text I set out to critically examine part of the source code of the recommender system LightFM. To this end I deploy the micropolitics developed by Gilles Deleuze and Félix Guattari, as well as Andrew Goffey’s and Maurizio Lazzarato’s readings of their micropolitical ideas. I build an argument around Guattari’s suggestion that subjectivity is not solely the product of human brains and bodies, and that the technical machines of computation intersperse with what might be thought of as human in the production of subjectivity. Drawing upon contemporary approaches to the nonhuman, machine learning and planetary-scale computation, I develop a framework that situates the recommender system in assemblages of self-ordering matter and links it to historical practices of control through tabulation. While I acknowledge the power of source code in that it always carries the potential for control, in this reading, I impute greater agency to computation. In what follows, rather than reducing it to an algorithm, I attempt to address the recommender system as manifold: a producer of subjectivity, a resident of planet-spanning cloud computing infrastructures, a conveyor of inscrutable semiotics and a site of predictive control.

The Interstices of Human and Technical Machine

In addition to addressing LightFM at the level of its technical workings and infrastructural instantiations, I aim to conceptualise the role these workings play in the ongoing crystallisation of power relations. Deleuze and Guattari offer a useful starting point: everything, for them, is political and every politics is “simultaneously a micropolitics and a macropolitics.”1 They offer the example of “aggregates of the perception or feeling type” and maintain that “their molar organisation, their rigid segmentarity, does not preclude the existence of an entire world of micropercepts, fine segmentations that grasp or experience different things, are distributed and operate differently.”2 Much as there are multitudes of molecular variations that escape dominant molar understandings of perception, I argue that there are sinuous historical and machinic paths that criss-cross and circle the linear division of human and technical machine. Further, “there is a double reciprocal dependency between”3 the molar and molecular. Just as multifarious populations are shaped by molar classes, the intertwining histories and relations of humans and technical machines are moulded by the dichotomy’s rigid structure. As a result, some theorisations of humans and technical machines privilege one side of this dichotomy. For instance, in Andy Clark’s notion of the extended mind4 and Marshall McLuhan’s conception of media as extensions of man5, media and technical machines are rendered as mere prostheses to human cognition or perception. I favour neither this nor the opposite molar approach of imputing disproportionate agency to technical machines; there is more to explore at the molecular level, in what we might think of as a machinic unconscious.

In his book The Machinic Unconscious, Guattari posits “a consciousness independent of individuated subjectivity [that] could manifest itself as a component in the assemblages of enunciation, ‘mixing’ social, technical and data processing machines with human subjectivity, but could also manifest itself in purely machinic assemblages, for example in completely automated and computerized systems”6. While this partly or fully nonhuman consciousness is central to my analysis of LightFM, my reading does not impute consciousness to non-living things. Were it to do so, I might be seen to espouse a form of panpsychism which views “mind [as] a fundamental property of matter itself”7; I believe that this position and the questions it raises are beyond the scope of this text. I limit myself to tracing, through these “social, technical and data processing machines”, semiotic processes that reside in the barren and unmapped hinterlands of human subjectivity. Following Maurizio Lazzarato, I identify “mixed” and fully computational enunciatory assemblages as proto-subjectivities. These belong to the register of “non-representational and asignifying”8 semiotics: a sign system that operates below the threshold of individuated subjectivity and one that favours pattern over meaning.

Individuated subjectivity, for Lazzarato, is not the sum total of subjectivity; it is produced by the macropolitical process of social subjection that assigns subjects “an identity, a sex, a body, a profession, a nationality, and so on.”9 Conversely, proto-subjectivity comprises “a multiplicity of human and nonhuman subjectivities and proto-subjectivities”10; it arises from the micropolitical process of machinic enslavement, which “dismantles the individuated subject, […] acting on both the pre-individual and supra-individual levels.”11 Individuated subjectivity is representational, autobiographical and identitarian, giving rise to a clearly delineated subject who acts instrumentally upon external objects.12 Proto-subjectivity is non-representational and pre-individual, capable of emerging in all autopoietic machinic systems.13 Critically engaging with proto-subjectivity is by necessity a speculative endeavour calling for non-representational thinking, which Nigel Thrift identifies with the “anti-biographical and pre-individual”14, with a “vast spillage of things” and with “affect and sensation”15. LightFM’s source code is an uneasy compromise between the representational, signifying semiotics of individuated subjects and the non-representational asignifying semiotics of technical machines. As such, in this text I must go further than merely explicating the procedures of computation, as I do in the companion text, and address computation in terms of its material instantiations, its histories and its production of affect.

Approaching the Machinic Unconscious

Andrew Goffey suggested, in a recent lecture on the micropolitics of software, that a technological or machinic unconscious might be one that “crosses the histories of programming practices and their shifting relations to the infrastructures that they produce and are produced by – [disclosing] the fragmented possibilities of a different relationship to power.”16 What conceptual tools might help us figure this relationship, to think through these histories and infrastructures, mixed assemblages and proto-subjectivities? We can start by turning our attention, as some contemporary philosophy does, to the nonhuman. One example is object-oriented ontology, a branch of speculative realism that includes thinkers such as Timothy Morton, who would consider a poem, and presumably a piece of code, a nonhuman agent.17 However, as Jane Bennett notes, object-oriented philosophers refuse the label “materialist”, viewing objects as isolated entities withdrawn from other things.18 Such a position leaves little room for the molecular, micropolitical processes I am concerned with here, many of which are relational or take place close to the undifferentiated level of the machinic phylum. Another more established alternative is Actor-Network Theory, which is primarily concerned with the observation of connections between agents, both human and nonhuman.19 While an emphasis on nonhuman objects and nonhuman agency is useful, by building on the materialism of Deleuze and Guattari and retaining reference to subjectivity,20 Braidotti’s neo-vitalist materialism provides a better ground for my argument. For her, “all human and non-human entities are nomadic subjects-in-process, in perpetual motion, immanent to the vitality of self-ordering matter.”21 We will return to Braidotti later; for now it suffices that autopoietic matter provides a solid ontological substrate for delineating proto-subjectivities.

We now turn to the subject matter of this text, which Adrian Mackenzie addresses in some depth in his recent book Machine Learners. Mackenzie’s titular machine learner can be both human and nonhuman, or indeed constitute “human-machine relations”22. These human-machine relations are the sites of proto-subjectivities, mixed semiotic assemblages that “inhabit a vectorised space and [whose] operations vectorise data.”23 I place the vectors and matrices of machine learning in a genealogy of tables as technologies that aid in control. This genealogy stretches from ancient Mesopotamia24 and encompasses the introduction of tab keys in typewriters25 and the adoption of punch-card tabulating machines26 in turn-of-the-century bureaucracies as well as the relational databases of the 1970s. It is not a straightforward genealogy, however; as Mackenzie notes, for machine learners, the sheer number of dimensions in vector space can “thwart tabular display” and tables can “change rapidly in scale and sometimes in organisation”27. Drawing on Foucault’s account of tables from different eras, Mackenzie argues that the matrices of machine learning mark a return to the Classical or even pre-Classical tables28 that married heterogeneous elements and were structured according to plural and diverse resemblances.29 For example: a matrix of online purchases would place vectors for hair dryers alongside those for garden ornaments; a machine learner, owing to its profoundly flattened ontology, would subject them to the same computation, tracing diverse resemblances through blind repetition.

How might we conceptualise LightFM if it were deployed on a global cloud platform like Amazon? What if, instead of being trained with the ubiquitous MovieLens 100k dataset, LightFM could vectorise the largest ever accretion of user and product metadata on the planet? To aid in answering this question, I will borrow Benjamin Bratton’s model of planetary-scale computation: the ‘Stack’. Setting aside the geopolitical intricacies of Bratton’s argument, the stack can be thought of as “a vast software/hardware formation, a proto-megastructure built of crisscrossed oceans, layered concrete and fibre optics, urban metal and fleshy fingers”30. I would argue that planetary-scale machine learning sits at the intersection of the material megastructure of the stack and the asignifying semiotic processes of the machinic unconscious. Bratton’s ‘Stack’ is divided into six layers: ‘Earth’, ‘Cloud’, ‘City’, ‘Address’, ‘Interface’ and ‘User’. These can be placed on a vertical spectrum, rising from molecular to molar – from the geological and chemical, through to the computational all the way up to individuated subjectivity. The micropolitical analysis that follows is primarily concerned with the more molecular ‘Cloud’ layer. However, recalling the double reciprocal dependency between micropolitics and macropolitics, in thinking through the computation that works with vectors and matrices of user meta-data in the ‘Cloud’ layer, we may glean insights into how individuated subjectivity is produced in the ‘User’ layer.

Reanimating the Code

Wendy Chun holds that what we call source code “is more properly an undead resource: a writing that can be reanimated at any time, a writing that haunts our executions.”31 I share Chun’s hesitancy to locate agency primarily in source code, which I view as human-readable shorthand with the potential – through multiple translations – to inscribe the palimpsest-like surfaces of computational agency. It is only insofar as the source code haunts its concrete executions that we can read it micropolitically at all. The terse sentences and mathematical formulae Maciej Kula uses to describe LightFM’s algorithm also haunt these executions, but even more spectrally and tenuously than does the code. To illustrate, the following short passage describes the part of LightFM’s algorithm that the source code examples are responsible for implementing: “The latent representation of user u is given by the sum of its features’ latent vectors […] The model’s prediction for user u and item i is then given by the dot product of user and item representations, adjusted by user and item feature biases”32. Algorithms are divorced from what Goffey calls implementation details: “embodiment in a particular programming language for a particular machine architecture”33. The Cython source code is as close as my method allows me to get to the micropolitics and asignifying semiotics of computation, but how do I approach it? I heed Mark Marino’s warnings against analysing code in isolation from its “historical, material, social context” and drawing specious analogies between computation and unrelated social practices or cognitive processes.34 Instead, in what follows, I attempt to speculate beyond the text of the code, to the data structures it references, the materiality of its executions and how these relate to power and control.

Mackenzie observes that many machine learners seek to approximate data by plotting lines and curves through it, or dividing it with lines, planes35 and hyperplanes.36 LightFM, however, mainly transforms and compresses a potentially enormous vector space into smaller more easily computable representations, on which it bases its predictions. One of the parameters of the compute_representation function is the embedding vector for a feature, such as a book genre. The embedding is arrived at based on a matrix, a two-dimensional grid of numbers. In this matrix, each of millions of users is assigned a row and each of hundreds of thousands of books a column; each cell where a user and book intersect contains the number “1” if the user has bought the book, otherwise “0”. Now the transformation: from this vast binary matrix an embedding vector of a genre such as ‘software studies’ is produced that points in more or less the same direction as the vectors for other genres of books also bought by software studies enthusiasts. Perhaps the directions of these embedding vectors are among what Mackenzie refers to when he describes the vectorised table as a “machinic process that multiplies and propagates into the world along many diagonal lines.”37 These vectors are arrived at iteratively, through trial and error, in what could be thought of as discretised space and stepwise time. Proto-subjectivities inhabit the discrete space-time of vector computation, just as they reach through a maze of cables, routers and interfaces to the smooth and continuous bodies of users, their unthinking habits and gradations of affective intensity.

A non-representational reading of a function called compute_representation leaves no space for irony. It leaves little space for users, perhaps more for items, although the function makes no distinction. It expects a sequence of ones and zeros that correspond to embedding vectors, one of which may be the embedding for ‘software studies’. It wants a reference to an existing representation, a sequence of floating-point numbers,38 each a word 32 bits or binary digits long, arrayed one after another at a particular numeric address in a memory module in one of thousands of racked servers in a data centre. Execution begins. It marks six 32-bit chunks of memory for later use. It switches context, to the get_row_start function in the features object; this object is an agglomeration of data and executable instructions sprawling across a bristling microscopic patch of physical memory. It steps through each instruction in this foreign function and remembers the result. It switches back and writes the result to start_index, one of the reserved chunks of memory; the same for stop_index. One by one, all bits in all words in the representation array are set to 0, switched off. It later, cycle-by-cycle, switches some of these bits on, and sometimes off again, endlessly toggling states; often all bits remain unchanged for an entire cycle as it blindly adds zero to each part of the representation.

Planetary-scale Prediction

The notion of deriving a prediction from a representation is firmly situated in the familiar signifying semiotics of individuated subjectivity. And yet compute_prediction_from_repr merely calculates the inner product of two vectors, two sequences of numbers, through brute iteration. Mark Andrejevic observes that it is precisely this lack of understanding, reasoning and intuition that gives data-driven prediction its power39 – though control may be more apt here. Recall the genealogy of the table alluded to earlier; two-dimensional tables such as timetables are a mark of Foucault’s disciplinary society, which Deleuze suggests has been supplanted by a society of control. I argue that this rupture in the table’s genealogy, the expansion and fragmentation of two-dimensional grids into inscrutable vector spaces, mirrors the fission of enclosed individuals into “‘dividuals’ and masses, samples, data, markets or ‘banks’”40. Instead of being placed in a panoptic cell and observed, subjects are divided into row vectors in a panoply of matrices dotted around the ‘Cloud’, and predicted. This control differs from the “purposive influence to a predetermined goal”41 that James Beniger posits as the seeds of the information society; it is closer to what Luciana Parisi and Steve Goodman call “mnemonic control”: “the power to foreclose an uncertain, indeterminate future by producing it in the present”42.

How does machinic prediction at the ‘Cloud’ layer entail control at the ‘User’ layer? In constructing vectors that stand in for users, LightFM may be producing categories of subjects, in that these vectors could coalesce or cluster into molar classes. This is a tendency observed by Braidotti, whereby “the neoliberal system finds ways to capitalize also on the marginal and the molecular formations, recomposing them as multiple molarities (i.e. billions of Facebook pages).”43 This can also be figured as purposive movement within the autopoietic matter that makes up the ‘Stack’. Computational proto-subjectivities and their asignifying semiotic chains snake through the ‘Cloud’, ‘City’, ‘Address’ and ‘Interface’ layers, terminating in individuated subjects at the ‘User’ layer.44 How might proto-subjectivities apprehend themselves to these subjects? Perhaps as the imperceptible background hum of the ontological power of the future in the present.45 If I were to speculate: a subject may feel itself to be acting on blind compulsion. Hunched over a smartphone, through a fog of information fatigue, they may be faintly aware of being nudged toward certain actions, certain products, certain cultural content.

Conclusion

In this short text, I have gone some way in posing, if not answering, the question of how molecular, machinic processes in a recommender system like LightFM function and relate to power over individuated subjects. I have hinted at how computation may embody a certain kind of agency that feeds into the production of users as subjects. Bratton’s model of The Stack aided me in figuring the impingement of proto-subjectivities on molar aggregates such as subjects and users. The idea I have sketched out, of vectorisation and predictive control as a rupture in the genealogy of the table, may be an avenue for more extensive research. Although I never primarily attributed agency to the code itself, I could only circumvent the limits of this technical text through speculation. A more empirical approach like Actor-Network Theory may have better elucidated some micropolitical aspects of LightFM, in mapping concrete connections between agents such as computers, programmers and users.


1.   Gilles Deleuze and Félix Guattari, A Thousand Plateaus: Capitalism and Schizophrenia, translated by Brian Massumi, London, Bloomsbury Academic, 2013 [1980], p. 249.
2.   Ibid.
3.   Ibid.
4.   Cp. Andy Clark and David Chalmers, “The Extended Mind”, Analysis, 58 (1), 1998, pp. 7–19.
5.   Cp. Marshall McLuhan, Understanding Media: The Extensions of Man, Corte Madera, CA, Gingko Press, 2003 [1964].
6.   Félix Guattari, The Machinic Unconscious, Cambridge, MA, Semiotext(e), 2011 [1979], p. 221.
7.   Steven Shaviro, “Consequences of Panpsychism”, in Richard Grusin (ed.), The Nonhuman Turn, Minneapolis, University of Minnesota Press, pp. 19–44, here: p. 20.
8.   Maurizio Lazzarato, Signs and Machines, Los Angeles, Semiotext(e), 2014, p. 25.
9.   Ibid., p. 12.
10.   Ibid., p. 34.
11.   Ibid., p. 12.
12.   Cp. Ibid., p. 12.
13.   Cp. Ibid., p. 80.
14.   Nigel Thrift, Non-Representational Theory: Space | Politics | Affect, London, Routledge, 2008, p. 7.
15.   Ibid. p. 10.
16.   Andrew Goffey, “Andrew Goffey – Micropolitics of Software”, lecture at Subjectivity, Arts and Data, Department of Media Arts at Royal Holloway, University of London, 2018. Available at: https://www.youtube.com/watch?v=9bqxsmFo72k [accessed July 25, 2018].
17.   Cp. Jane Bennett, “Systems and Things”, in Richard Grusin (ed.), The Nonhuman Turn, Minneapolis, University of Minnesota Press, pp. 223–240, here: p. 234.
18.   Cp. Ibid., p. 226.
19.   Cp. Bruno Latour, Reassembling the Social, Oxford, Oxford University Press, 2008.
20.   Cp. Rosi Braidotti, “A Theoretical Framework for the Critical Posthumanities”, Theory, Culture & Society, 2018, pp. 1–31. Available at: http://journals.sagepub.com/doi/full/10.1177/0263276418771486 [accessed March 28, 2019].
21.   Ibid.
22.   Adrian Mackenzie, Machine Learners, Cambridge, MA, The MIT Press, 2017, p. 6.
23.   Ibid., p. 51.
24.   Cp. Francis Marchese, quoted in Mackenzie, Machine Learners, p. 56.
25.   Cp. JoAnne Yates, Control Through Communication, Baltimore, Johns Hopkins University Press, 1993, p. 80.
26.   Cp. James Beniger, The Control Revolution, Cambridge, MA, Harvard University Press, 1993, p. 80.
27.   Mackenzie, Machine Learners, p. 58.
28.   Cp. Ibid.
29.   Cp. Ibid., p. 56.
30.   Benjamin Bratton, The Stack: on Software and Sovereignty, London, The MIT Press, 2015, p. 52.
31.   Wendy Chun, “Wendy Chun – Critical Code Studies”, lecture at the University of Southern California, 2010. Available at: https://vimeo.com/163282630 [accessed August 25, 2018].
32.   Maciej Kula, “Metadata Embeddings for User and Item Cold-start Recommendations”, paper presented in the second workshop on New Trends on Content-Based Recommender Systems co-located with 9th ACM Conference on Recommender Systems, 2015. Available at: http://ceur-ws.org/Vol-1448/paper4.pdf [accessed July 27, 2018].
33.   Andrew Goffey, “Algorithm”, in Matthew Fuller (ed.), Software Studies: A Lexicon, London, The MIT Press, 2018, pp. 15–20, here: p. 15.
34.   Mark Marino, “Critical Code Studies”, Electronic Book Review, 2006. Available at: http://www.electronicbookreview.com/thread/electropoetics/codology [accessed July 25, 2018].
35.   Just as a line is a straight one-dimensional geometric object that extends infinitely in both directions, a plane is a flat two-dimensional object that extends infinitely in every direction. A point has zero dimensions and can be used to divide a one-dimensional line into two line segments, which can represent classes in the case of a machine learning classifier working with one parameter. A line can be similarly used to divide a two-dimensional parameter space into two classes. For example, if one parameter was human height and the other weight, and the data were plotted on a scatter graph, a straight line could be drawn as a boundary to distinguish the overweight from the non-overweight. The same applies to a plane and a three-dimensional parameter space. As the geometric rules of a plane can be abstracted to arbitrarily high dimensional spaces, a hyperplane of one fewer dimensions than the parameter space can always be used to divide that space. Curved surfaces can be similarly used to classify data in more than three dimensions.
36.   Mackenzie, Machine Learners, p. 212.
37.   Ibid., p. 73.
38.   An IEEE-754 32-bit floating point number comprises three elements: a sign, an exponent and a fraction. A decimal number can be derived from the floating point representation in the following way: ± fraction × 2^exponent. The sign is a single bit that indicates whether the number is positive; the fraction is a 23-bit integer and the exponent an 8-bit integer.
39.   Cp. Mark Andrejevic, Infoglut: how too much information is changing the way we think and know, New York, Routledge, 2013, p. 21.
40.   Gilles Deleuze, “Postscript on the Societies of Control”, in October, 59, 1992, pp. 3–7, here: p. 5.
41.   Beniger, The Control Revolution, p. 36.
42.   Luciana Parisi and Steve Goodman, “Mnemonic Control”, in Patricia Clough and Craig Willse (eds.), Beyond Biopolitics: Essays on the Governance of Life and Death, Durham, NC, Duke University Press, 2011, pp. 163–176, here: p. 167.
43.   Braidotti, “A Theoretical Framework for the Critical Posthumanities”, p. 15.
44.   Cp. Guattari, The Machinic Unconscious, p. 51.
45.   Cp. Mark Hansen, “Our Predictive Condition”, in Richard Grusin (ed.), The Nonhuman Turn, Minneapolis, University of Minnesota Press, pp. 101-138, here: p. 125.