Twitter Example

This section contains examples of analyzing a Twitter feed using IoTPy. You must establish your own credentials with Twitter to acquire data. The Twitter analysis code was written by Tommy Hannan after his first year at MIT. The code is in IoTPy/examples/Twitter/.

Example 1

This simple example is an application consisting of a single process. Its source acquires tweets from Twitter, and the process then performs a simple analysis of each tweet.

The code consists of a single source, a compute function (compute_func), and the standard steps for creating a single-process application. The function twitter_analysis gets the tweets specified by trackwords and executes a function, tweet_analyzer, on each tweet.

twitter_to_stream is a source described in the source section of these pages. Its code is found in IoTPy/examples/Twitter/twitter.py.

def twitter_analysis(
        consumer_key, consumer_secret,
        access_token, access_token_secret,
        trackwords, tweet_analyzer, num_tweets):
    # SOURCE
    def source(out_stream):
        return twitter_to_stream(
            consumer_key, consumer_secret,
            access_token, access_token_secret,
            trackwords, out_stream, num_tweets)
    # COMPUTATIONAL FUNCTION
    def compute_func(in_streams, out_streams):
        sink_element(func=tweet_analyzer,
                     in_stream=in_streams[0])
    # PROCESSES
    proc = shared_memory_process(
        compute_func=compute_func,
        in_stream_names=['in'],
        out_stream_names=[],
        connect_sources=[('in', source)],
        connect_actuators=[],
        name='proc')
    # CREATE AND RUN MULTIPROCESS APPLICATION
    mp = Multiprocess(processes=[proc], connections=[])
    mp.run()

In the above code, compute_func consists of a single agent, sink_element, which applies the function tweet_analyzer to each element of its input stream.

Two examples of tweet_analyzer are given in the code. In the file twitter.py, tweet_analyzer prints tweet summaries. In the file twitter_analysis.py, tweet_analyzer analyzes the sentiment of each tweet, the number of followers of its author, and its number of retweets.
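As a rough illustration of the kind of function that could be passed as tweet_analyzer, the plain-Python sketch below builds a one-line summary of a tweet. It assumes each tweet arrives as a dictionary with a 'text' field and a 'user' dictionary containing 'screen_name'. The names summarize_tweet and print_summary and the sample tweet are hypothetical and are not taken from twitter.py.

```python
def summarize_tweet(tweet, max_chars=50):
    """Return a one-line summary of a tweet given as a dictionary.

    Assumption: the tweet has 'text' and 'user'/'screen_name' fields.
    Missing fields fall back to defaults instead of raising.
    """
    name = tweet.get('user', {}).get('screen_name', 'unknown')
    text = tweet.get('text', '')
    if len(text) > max_chars:
        # Truncate long tweets so each summary stays on one line.
        text = text[:max_chars] + '...'
    return '@{}: {}'.format(name, text)


def print_summary(tweet):
    """A sink-style function: consumes one tweet and returns nothing."""
    print(summarize_tweet(tweet))


# Hypothetical tweet used only for this illustration.
sample_tweet = {'text': 'Streams are fun', 'user': {'screen_name': 'alice'}}

# Calling the function on each element of a list mimics what a sink
# agent does for each element of its input stream.
for tweet in [sample_tweet]:
    print_summary(tweet)
```

A function of this shape returns nothing, which fits its use in a sink: it consumes each tweet and only produces a side effect.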

Example 2: Multiprocess implementation

This is the same as Example 1, except that the component functions in twitter_analysis are implemented in four shared-memory processes.

Process 0

This process merely acquires tweets from Twitter and passes the tweets on to other processes. This process has a single input called ‘in’ and a single output called ‘out’.

    def source(out_stream):
        return twitter_to_stream(
            consumer_key, consumer_secret,
            access_token, access_token_secret,
            trackwords, out_stream, num_steps)
    def compute_func_0(in_streams, out_streams):
        map_element(lambda x: x, in_stream=in_streams[0],
                    out_stream=out_streams[0])
    proc_0 = shared_memory_process(
        compute_func=compute_func_0,
        in_stream_names=['in'],
        out_stream_names=['out'],
        connect_sources=[('in', source)],
        name='proc_0')

The compute function of this process merely copies the input stream to the output stream. The source, which acquires data from Twitter, is connected to the input called ‘in’. The connections to the output called ‘out’ are specified later.

Process 1

This process has a single input and a single output. It computes the sentiment of each tweet that arrives on the input. The connections to the input and output are specified later.

    def compute_func_1(in_streams, out_streams):
        def get_sentiment(tweet):
            tweet_text = get_text(tweet)
            sentiment_of_tweet = sentiment_of_text(tweet_text)
            return (tweet_text, sentiment_of_tweet)
        map_element(
            func=get_sentiment,
            in_stream=in_streams[0],
            out_stream=out_streams[0])
    proc_1 = shared_memory_process(
        compute_func=compute_func_1,
        in_stream_names=['in'],
        out_stream_names=['out'],
        connect_sources=[],
        name='proc_1')
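The implementation of sentiment_of_text is not shown on this page. To make the shape of the value returned by get_sentiment concrete, the following stand-in scores text against a tiny hand-made word list. The function names, the word sets, and the scoring rule are all assumptions made for illustration and are not the method used in the IoTPy example code.

```python
# Hypothetical word lists, chosen only for this illustration.
POSITIVE = {'good', 'great', 'love', 'happy'}
NEGATIVE = {'bad', 'awful', 'hate', 'sad'}


def toy_sentiment_of_text(text):
    """Return (#positive words - #negative words) for the given text."""
    words = [w.strip('.,!?').lower() for w in text.split()]
    positives = sum(1 for w in words if w in POSITIVE)
    negatives = sum(1 for w in words if w in NEGATIVE)
    return positives - negatives


def toy_get_sentiment(text):
    """Pair the text with its score, like the tuple get_sentiment returns."""
    return (text, toy_sentiment_of_text(text))


scored = [toy_get_sentiment(t) for t in ['I love this, it is great!', 'Bad day.']]
```

Each output element is a (text, score) pair, so a downstream process receives both the original text and its sentiment.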

Process 2

Process 2 is identical to process 1 except that it computes the number of followers and the number of retweets of a tweet rather than its sentiment. The only difference is that the func argument of map_element is followers_and_retweets_of_tweet rather than get_sentiment. For brevity, the code is not given here.
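The body of followers_and_retweets_of_tweet is not reproduced on this page. As a sketch of what such a function might look like, the version below assumes that a tweet is a dictionary carrying 'retweet_count' and a 'user' dictionary with 'followers_count'. The name followers_and_retweets is hypothetical and the code is not taken from the IoTPy repository.

```python
def followers_and_retweets(tweet):
    """Return (followers of the author, retweets of the tweet).

    Assumption: the tweet is a dictionary with 'user'/'followers_count'
    and 'retweet_count' fields. Missing fields count as zero.
    """
    followers = tweet.get('user', {}).get('followers_count', 0)
    retweets = tweet.get('retweet_count', 0)
    return (followers, retweets)


# Hypothetical tweet used only for this illustration.
example = {'user': {'followers_count': 120}, 'retweet_count': 3}
result = followers_and_retweets(example)
```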

Process 3

Process 3 has two inputs called in_1 and in_2. This process has no outputs. The compute function of the process has two agents: one agent (zip_stream) zips the two input streams, and the second agent (stream_to_file) writes the result of the zip to a file.

    def compute_func_3(in_streams, out_streams):
        t = Stream()
        zip_stream(in_streams, out_stream=t)
        stream_to_file(in_stream=t, filename='result.dat')
    proc_3 = shared_memory_process(
        compute_func=compute_func_3,
        in_stream_names=['in_1', 'in_2'],
        out_stream_names=[],
        connect_sources=[],
        name='proc_3')
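The effect of zipping the two inputs can be pictured with ordinary Python lists standing in for streams. In this sketch the sample values are made up, and writing one pair per line is an assumption about the file layout rather than a description of stream_to_file. A real stream also grows over time, which a list does not capture.

```python
import os
import tempfile

# Stand-ins for the two input streams of process 3.
in_1 = [('great day', 1), ('bad traffic', -1)]   # (text, sentiment)
in_2 = [(120, 3), (45, 0)]                        # (followers, retweets)

# The i-th zipped element pairs the i-th element of each input.
zipped = list(zip(in_1, in_2))

# Write each zipped element on its own line of a temporary file.
path = os.path.join(tempfile.mkdtemp(), 'result.dat')
with open(path, 'w') as f:
    for pair in zipped:
        f.write(str(pair) + '\n')

# Read the file back to confirm what was written.
with open(path) as f:
    lines = f.read().splitlines()
```

Because the two inputs are derived from the same sequence of tweets, pairing by position lines up the sentiment of a tweet with the follower and retweet counts of that same tweet.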

Connect Processes

    mp = Multiprocess(
        processes=[proc_0, proc_1, proc_2, proc_3],
        connections=[(proc_0, 'out', proc_1, 'in'),
                     (proc_0, 'out', proc_2, 'in'),
                     (proc_1, 'out', proc_3, 'in_1'),
                     (proc_2, 'out', proc_3, 'in_2')])
    mp.run()

Finally, we connect all the processes together and run the multiprocess application. The output called ‘out’ of process 0 is connected to the input called ‘in’ of process 1 and to the input called ‘in’ of process 2. The outputs called ‘out’ of processes 1 and 2 are connected to the inputs called in_1 and in_2 (respectively) of process 3.
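The wiring described above can be checked with a few lines of plain Python. This sketch does not use IoTPy: process names are written as strings instead of process objects, and the tuple layout (sender, output name, receiver, input name) mirrors the connections list shown earlier. It confirms that the output of process 0 fans out to two receivers and that every connected input has exactly one sender.

```python
from collections import defaultdict

# String stand-ins for the process objects in the connections list.
connections = [
    ('proc_0', 'out', 'proc_1', 'in'),
    ('proc_0', 'out', 'proc_2', 'in'),
    ('proc_1', 'out', 'proc_3', 'in_1'),
    ('proc_2', 'out', 'proc_3', 'in_2'),
]

fan_out = defaultdict(list)   # (sender, output) -> list of (receiver, input)
feeders = defaultdict(list)   # (receiver, input) -> list of (sender, output)
for sender, out_name, receiver, in_name in connections:
    fan_out[(sender, out_name)].append((receiver, in_name))
    feeders[(receiver, in_name)].append((sender, out_name))

# Every connected input should be fed by exactly one output.
single_feeder = all(len(senders) == 1 for senders in feeders.values())
```

The input ‘in’ of process 0 does not appear in this list because it is fed by a source rather than by another process.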