One of the many things we’re all pretty good at, partly because of the way we’re wired, is spotting visual patterns. Take the following image, for example, which is taken from Google Trends and shows relative search volume for the term “flowers” over the last few years:

![]()

The trend shows annual periodic behaviour (the same thing happens every year), with a couple of significant peaks showing heavy search volumes around the term on two separate occasions, a lesser blip between them and a small peak just before Christmas; can you guess what these occasions relate to? ;-) The data itself can be downloaded in a tatty CSV file from the link at the bottom left of the page (tatty because several distinct CSV data sets are contained in the one file, separated by blank lines). The sampling frequency is once per week.

The flowers trace actually holds a wealth of secrets (behaviours vary across the UK and the US, for example), but for now I’m going to ignore that detail (I’ll return to it in a later post). Instead, I’m just going to (start) asking a very simple question: can we automatically detect the periodicity in the trend data?

Way back when, my first degree was electronics. Many of the courses I studied related to describing in mathematical terms the structure of “systems” and the analysis of the structure of signals, ideal grounding for looking at time series data such as the Google Trends data, and web analytics data. Though I’ve since forgotten much of what I studied then, I can remember the names of many of the techniques and methods, if not how to apply them.

So one thing I intend to do over the next quarter is something of a refresher in signal processing/time series analysis. Which is to say, I would appreciate comments on at least three counts: firstly, if I make a mistake, please feel free (in fact, you are obliged) to point it out; secondly, if I’m missing a trick, or an alternative/better way of achieving a similar or better end, please point it out; thirdly, the approach I take will be rediscovering the electronics/engineering take on this sort of analysis. Time series analysis is also widely used in biology, economics etc etc, though the approach or interpretation taken in different disciplines may be different*; if you can help bridge my (lack of) engineering understanding with a biological or economic perspective/interpretation, please do so ;-)

(*I discovered this in my PhD, when I noticed that the equations used to describe evolution in genetic populations in discrete and continuous models were the same as equations used to describe different sorts of low pass filters in electronics. That means that, under the electronics-inspired interpretation of the biological models, we could say by inspection that populations track low frequency components (components with a periodicity over 10s of generations) and ignore high frequency components.)

To start with, let’s consider the autocorrelation of the trend data. Autocorrelation measures the extent to which a signal is correlated with (i.e. similar to) itself over time. Essentially, it is calculated from the product of the signal at each sample point with a timeshifted version of itself, with the products summed for each timeshift. (Wikipedia is as good as anywhere to look up the formal definition of autocorrelation.) I used Python’s matplotlib to calculate the autocorrelation, using this gist. The numbers in the array are the search volume values exported from Google Trends.

![]()

The top trace shows the original time series data, in this case the search volume (arbitrary units) of the term “flowers” over the last few years, with a sample frequency of once per week. The second trace is the autocorrelation, over all timeshifts. Whilst there appear to be a couple of peaks in the data, it’s quite hard to read, because the variance of the original signal is not so great. Most of the time the signal value is close to 1, with occasional excursions away from that value. However, if we subtract the average signal value from the original signal value (finding g(t)=f(t)-MEAN(f)) and then run the autocorrelation function, we get a much more striking view of the autocorrelation of the data:

![]()
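
For anyone wanting to play along without the Google Trends export, here is a minimal sketch of the autocorrelation calculation described in this post, with and without the mean subtraction g(t)=f(t)-MEAN(f). It is not the code from the gist and it does not use matplotlib: it is plain Python, and the function names (`autocorrelation`, `strongest_lag`) and the synthetic `weekly` signal are my own assumptions, made up purely for illustration.

```python
def autocorrelation(signal, subtract_mean=True):
    """Return the autocorrelation of `signal` for timeshifts 0 .. len(signal) - 1.

    Each value is the sum of products of the signal with a copy of itself
    shifted by `lag` samples, divided by the lag-0 sum so that result[0] == 1.
    """
    n = len(signal)
    mean = sum(signal) / n if subtract_mean else 0.0
    g = [x - mean for x in signal]           # g(t) = f(t) - MEAN(f)
    energy = sum(v * v for v in g)           # lag-0 sum of products
    return [
        sum(g[t] * g[t + lag] for t in range(n - lag)) / energy
        for lag in range(n)
    ]


def strongest_lag(acf):
    """Return the non-zero timeshift with the largest autocorrelation.

    Only the first half of the lags is searched, because long shifts
    overlap too few samples to be trustworthy.
    """
    candidates = range(1, len(acf) // 2 + 1)
    return max(candidates, key=lambda lag: acf[lag])


# Hypothetical stand-in for the weekly "flowers" data: four years of 52
# samples, mostly at 1.0, with one large and one smaller spike each year.
weekly = [1.0] * (4 * 52)
for year in range(4):
    weekly[year * 52 + 6] = 2.0
    weekly[year * 52 + 19] = 1.6

acf_raw = autocorrelation(weekly, subtract_mean=False)
acf_centred = autocorrelation(weekly, subtract_mean=True)
period = strongest_lag(acf_centred)

print("strongest timeshift, raw signal:", strongest_lag(acf_raw))
print("strongest timeshift, mean subtracted:", period)
```

On this made-up signal the mean-subtracted version picks out a timeshift of 52 samples (one year of weekly data), whereas the raw version simply peaks at a shift of one sample, because the constant offset near 1 swamps the products. Real trend data will be noisier, so treat the simple "largest value" rule as a starting point rather than a robust detector.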