Saturday, June 28, 2008

Tide data filtering in Matlab and the mysterious filtfilt issue

Last week I found myself trying to deal with a ton of tide data retrieved from a tide gauge installed in Castine, ME. The gauge takes 4 seconds to capture a data point. Raw (unconverted) water level is recorded first, followed by raw temperature 4 seconds later, and a reference value 4 seconds after that. A full record, therefore, is generated every 12 seconds. This produces a lot of data.

In order to facilitate the comparison of this gauge data to a NOAA primary station, I wanted to automate the task of determining the higher highs and lower lows. A six-minute averaging scheme was developed in Matlab by my friend Val in order to generate a six-minute record similar to the NOAA format. This still left us with squared peaks, so I decided to apply a filter to the data.

Problem: I wanted to use filtfilt, which filters data in the forward direction and then again in the reverse direction, resulting in a zero phase distortion. I tested filtfilt out on the full 12-second water level data and everything seemed hunky dory. The problem came when I tried it out on the 6-minute averaged data. The filter seemed to run okay with no errors, but the result was all NaN (no data) values. At first I thought it was caused by zeros in my vector data, so I changed all 0.00 into 0.01. This did not solve the problem. To make the problem even more frustrating, filtfilt worked on every other column of data coming out of the 6-min-averaging scheme except water level (time, std, etc.). What is up with that?

Work around: I used the convn function in Matlab in order to apply a convolution to the data. This filter can cause a phase distortion, however, by specifying the shape as 'same', the central part of the convolution is returned and the distortion is minimized.

For the comparison of our preliminary gauge to the NOAA primary gauge in Bar Harbor, a slight phase distortion will not affect the results. In order to really be correct, however, a filter that guarantees no phase distortion should have been used. If anyone has any ideas why filtfilt would not work on a running-averaged dataset, I am all ears. Even one of my profs could not figure it out.

I must say though, the results of my code using convn do look pretty nice on first inspection. Rounded peaks and automatically picked highs and lows (using the extrema function, modified so that returns are in linear time order, not descending):


12 comments:

  1. I found your blog via googling for 'matlab filtfilt nan', since I was having the same problem. It turns out one of the entries in the unfiltered array was -inf, and changing it to 0 fixed the problem. (I can imagine 0 could be bad, too, so if that doesn't work, maybe try something else.)

    ReplyDelete
  2. I found your post in an identical way to John (previous comment).

    I was filtering a dataset which contained no NaN's, no inf's, etc. Filtfilt was giving no error, yet the result was all NaN's. If I filtered a small part of the dataset (e.g. first 100 points) the filtered series 'blew up' quickly.

    After an extremely frustrating couple of months doing numerous work-arounds, I finally found my problem.

    My input data series was of type 'single', rather than 'double'. To change the data to double precision, just use

    >> double(indataset)

    This worked for me, maybe it might work for you. The 'single', 'double' property may have been set by the way the data was recorded on the tide gauge.

    Phil.

    ReplyDelete
  3. This update is long overdue, but I just wanted to thank both John and Phil for their suggestions.

    Here is what I have tried:

    I had a few 0 values, so I tried changing them to 0.0001 instead. This did not work. I also tried removing my negative values (only 3 of them) just to see if that helped for any reason, it didn't.

    The column of data I am attempting to use filtfilt on is already catergorized as a double. I tried reassigning it just in case and then reapplied filtfilt and again got all NaNs.

    Right now this issue is on the back burner for me (as the project is over and the work around seemed to do the trick), but I know this is going to come back and haunt me at some point.

    ReplyDelete
  4. I was having the exact same issue with a whole vector of NaN. after spending time playing with the filter types and not being able to figure out what was wrong, i found this blog.
    Thanks to Phil, i was reminded about the fact that the variable needs to be of type DOUBLE! i have had this error before but had forgotten the solution.

    while the 'double' may not be the solution to all the filtfilt problems, it is probably a good thing to try first

    ReplyDelete
  5. I also ended up here after googling 'matlab filtfilt nan' and had similar problems to everyone else. My problem was that waaaaay at the end of the data set were a few characters that Matlab couldn't recognize from the original data file left from the data recorder being cut off. It's easy to miss what's going on at the end of a 6MB dataset.

    ReplyDelete
  6. Hey, I found your blog while facing the same problem with filtfilt. I however do not have any inf and the data is double. :/

    ReplyDelete
  7. I know it's been several years since you made this blog post, but I found your post looking for answers for the very same filtfilt issue, and having just solved my problem, I thought I'd post a solution here and see if it could help anyone else.

    I was filtering EEG (electroencephalography) data with an IIR filter program that used filtfilt.

    My data was in the form of a very long (900 000+) row vector, and when I tried running it whole through filtfilt, I would get a row of NaNs. But when I tried a much smaller section of data, taken from anywhere in the vector, filtfilt worked!

    In my case, for whatever reason, if the vector was longer than 502088 columns, filtfilt didn't work.

    -Gabrielle

    ReplyDelete
  8. Gabrielle,
    Thank you for your post! I'll have to dredge up that old data and see if that was part of my problem. I cannot remember how big my array was offhand, but this could, very well be the issue. I cannot remember if I tried sizing it down already or not. Once I have the time to try this solution, I'll post back.

    ReplyDelete
  9. I had exactly the same problem. However, none of the remedies described above worked for me. I had no NaN, Inf, or gaps in my type ‘double’ data. Choping-off part of the data didn’t work either (it’s not an option either since I don’t want to lose data).

    What worked was reducing the order of the filter. I was using Butterworth filter (buttord and butter functions). By playing with the passband and stopband widths, the transition width, and ripple sizes I was able to reduce the filter order until it worked. However, larger data sets required smaller filter orders (the two seems to interact). In the end, I had to change my filter type to elliptic (which has sharper transition between the passband and stopband but larger riples). With elliptic filter (ellipord and ellip functions) I was able to accomplish my task with a smaller filter order.

    To sum up, if none of the above described methods work for you, you might need to reduce your filter order by adjusting pass- and stopband widths and ripple sizes or changing the filter type altogether.

    ReplyDelete
  10. The combination of two things worked for me. First as Martynas said, I reduced my filter order by using an elliptical filter. Second I padded my signal with zeros on both sides, and made sure the transition from those zeros to signal is smooth . Then my filter produced stable results.

    ReplyDelete
  11. For those where data with no NaNs, no infs and of Double type, what is probably happening is that higher-order filters result in rounding errors of the coefficients coming out of butter(...) or equivalent, and you then have an unstable filter.

    The solution is actually listed under the 'Limitations' section in the MATLAB help for butter(), and involves using zero-pole-gain design instead of filter coefficient design:

    % Zero-Pole-Gain design
    [z, p, k] = butter(n,Wn,ftype);
    [sos,g]=zp2sos(z,p,k);
    h=dfilt.df2sos(sos,g);

    h2 is a filter object that can then be used by filter() in much the same way as normal:

    y = filter(h, x);

    although I have the feeling I read somewhere that filtfilt() doesn't accept these filter objects...

    Anyway, I hope this helps.

    Will

    ReplyDelete
  12. Found this blog also after googling "filtfilt NaN".

    I use the butterworth filter creating a bandpass on acceleration data with cut-off frequencies 0.5 and 20 Hz. My data doesn't contain NaNs, inf or similar.

    It works fine until I reduce the upper cut-off frequency e.g. to 10 Hz. Then I only get NaNs.

    After experimenting a little bit I found that reducing the filter order (originally 10) solves the problem. I found this in the help of butter:

    Limitations


    In general, you should use the [z,p,k] syntaxto design IIR filters. To analyze or implement your filter, you canthen use the [z,p,k] output with zp2sos and an sos dfilt structure. For higher order filters(possibly starting as low as order 8), numerical problems due to roundofferrors may occur when forming the transfer function using the [b,a] syntax


    When I use filter orders <= 8 I can choose any cut-off frequencies without a problem.

    Hope this helps someone with the same problems.

    ReplyDelete