The Shed is going Social! Join us on FaceBook and Twitter and chime in on the conversation.
|
 |
|
Dev Shed Forums
> Programming Languages
> Python Programming
|
Trying to Fit a Function to a Specific Region of Data
Discuss Trying to Fit a Function to a Specific Region of Data in the Python Programming forum on Dev Shed. Trying to Fit a Function to a Specific Region of Data Python Programming forum discussing coding techniques, tips and tricks, and Zope related information. Python was designed from the ground up to be a completely object-oriented programming language.
|
|
 |
|
|
|
|

Dev Shed Forums Sponsor:
|
|
|

February 20th, 2013, 02:10 PM
|
|
Registered User
|
|
Join Date: Jan 2013
Posts: 22
Time spent in forums: 5 h 50 m 28 sec
Reputation Power: 0
|
|
|
Trying to Fit a Function to a Specific Region of Data
Here is the graph:
I want to fit a 1st order polynomial/function between the smallest peak and the largest peak - ignoring all the noisy data on the far left.
How do I go about doing this?
So far, all I have is this:
Code:
x = s.t #time
data = s.firstharmonic #harmonics (y values)
c = (diff(sign(diff(data))) < 0).nonzero()[0] + 1 # local max - finding the peaks of the Harmonics
timemax = [] #list for time values (x values)
maxh = [] #list for peak Harmonics (y values)
for i in c:
timemax.append(x[i])
maxh.append(data[i])
maxdata = max(maxh) #This is the maximum value/peak in the list
def cutoff(maxh):
for i in range(len(maxh)-1):
if not maxh[i] > maxh[i+1]:
startindex = i
startvalue = maxh[i]
return startindex
polyfunc = polyfit(s.t,s.firstharmonic,1) #1st Order polynomial
yfit = polyval(polyfunc,s.t)
The last part of the code is wrong since it doesn't account for or gets stopped at the far left region of data. I just want to iterate and fit over the data from the minimum value to the maximum value
|

February 20th, 2013, 04:04 PM
|
|
Registered User
|
|
Join Date: Jan 2013
Posts: 22
Time spent in forums: 5 h 50 m 28 sec
Reputation Power: 0
|
|
If I were to put all the peak values into a list (which I have done), how do I start from the last value?
So it would be like I am iterating backwards.. This means I can put a condition similar to:
Code:
for i in range(len(maxh)-1):
if not maxh[i] > maxh[i+1]:
Of course it would be doing it backwards.. So I would start from the maximum peak and reach the lowest peak until I hit the noisy area.. of which the iteration stops.
|

February 20th, 2013, 06:16 PM
|
 |
Contributing User
|
|
|
|
|
itemindex=numpy.where(array==item)
If i and j are determined by
numpy.where(array==item)
you might use
polyfunc = polyfit(s.t[i:j],s.firstharmonic[i:j],1)
?
__________________
[code] Code tags[/code] are essential for python code!
|

February 20th, 2013, 06:41 PM
|
|
Registered User
|
|
Join Date: Jan 2013
Posts: 22
Time spent in forums: 5 h 50 m 28 sec
Reputation Power: 0
|
|
Quote: | Originally Posted by b49P23TIvg itemindex=numpy.where(array==item)
If i and j are determined by
numpy.where(array==item) |
I do not understand this part.
you might use
Quote: | Originally Posted by b49P23TIvg polyfunc = polyfit(s.t[i:j],s.firstharmonic[i:j],1) |
That would mean that I would have to always explicitly know what i and j are. The Two Stream instability model isn't the same everytime because it uses generators. Its like my previous Landau Damping but in reverse.
I have got this so far:
Code:
x = s.t #time
data = s.firstharmonic #harmonics y values
c = (diff(sign(diff(data))) < 0).nonzero()[0] + 1 # local max peak finder
timemax = []
maxh = []
rmaxh = [] #empty list to fill with reverse maxh values
for i in c:
timemax.append(x[i])
maxh.append(data[i])
#print maxh
for i in reversed(maxh):
print i
rmaxh.append(i)
#print rmaxh
def crap(rmaxh):
for j in range(len(rmaxh)-1):
if not rmaxh[j] > rmaxh[j+1]:
sliceindex = j
slicevalue = rmaxh[j]
return sliceindex
sliceindex = crap(rmaxh)
def twostream(timemax, rmaxh):
timestream = timemax[: sliceindex]
timenoisy = timemax[sliceindex :]
rmaxhstream = rmaxh[: sliceindex]
rmaxhnoisy= rmaxh[sliceindex:]
print timestream
print timenoisy
#coefs = polyfit(timestream, maxhstream,1)
#straightfit = polyval(polyfunc,timestream)
#return timestream, straightfit
#timestream, straightfit, = twostream(timemax, log(rmaxh))
The function "crap" is meant to find where the Harmonics stop growing (or the point where there is just noise), so that I can apply the function fitting there.
This also means I need values of time (x axis) to go along with my 'sliced' Harmonic data. Then I can fit the function along those ranges. It is very crude I know, but it does not work because apparently "timestream" is empty.. or something.
Here is the whole code:
Code:
#!/usr/bin/env python
#
# Electrostatic PIC code in a 1D cyclic domain
from numpy import *
from scipy import *
import matplotlib.pyplot as plt # Matplotlib plotting library
try:
import matplotlib.gridspec as gridspec # For plot layout grid
got_gridspec = True
except:
got_gridspec = False
# Need an FFT routine, either from SciPy or NumPy
try:
from scipy.fftpack import fft, ifft
except:
# No SciPy FFT routine. Import NumPy routine instead
from fft import fft, ifft
def rk4step(f, y0, dt, args=()):
""" Takes a single step using RK4 method """
k1 = f(y0, *args)
k2 = f(y0 + 0.5*dt*k1, *args)
k3 = f(y0 + 0.5*dt*k2, *args)
k4 = f(y0 + dt*k3, *args)
return y0 + (k1 + 2.*k2 + 2.*k3 + k4)*dt / 6.
def calc_density(position, ncells, L):
""" Calculate charge density given particle positions
Input
position - Array of positions, one for each particle
assumed to be between 0 and L
ncells - Number of cells
L - Length of the domain
Output
density - contains 1 if evenly distributed
"""
# This is a crude method and could be made more efficient
density = zeros([ncells])
nparticles = len(position)
dx = L / ncells # Uniform cell spacing
for p in position / dx: # Loop over all the particles, converting position into a cell number
plower = int(p) # Cell to the left (rounding down)
offset = p - plower # Offset from the left
density[plower] += 1. - offset
density[(plower + 1) % ncells] += offset
# nparticles now distributed amongst ncells
density *= float(ncells) / float(nparticles) # Make average density equal to 1
return density
def periodic_interp(y, x):
"""
Linear interpolation of a periodic array y at index x
Input
y - Array of values to be interpolated
x - Index where result required. Can be an array of values
Output
y[x] with non-integer x
"""
ny = len(y)
if len(x) > 1:
y = array(y) # Make sure it's a NumPy array for array indexing
xl = floor(x).astype(int) # Left index
dx = x - xl
xl = ((xl % ny) + ny) % ny # Ensures between 0 and ny-1 inclusive
return y[xl]*(1. - dx) + y[(xl+1)%ny]*dx
def fft_integrate(y):
""" Integrate a periodic function using FFTs
"""
n = len(y) # Get the length of y
f = fft(y) # Take FFT
# Result is in standard layout with positive frequencies first then negative
# n even: [ f(0), f(1), ... f(n/2), f(1-n/2) ... f(-1) ] Only want the first and the last
# n odd: [ f(0), f(1), ... f((n-1)/2), f(-(n-1)/2) ... f(-1) ]
"""Just interested in the first harmonic for the fastest growing mode"""
if n % 2 == 0: # If an even number of points (if not then n is odd)
k = concatenate( (arange(0, n/2+1), arange(1-n/2, 0)) )
else:
k = concatenate( (arange(0, (n-1)/2+1), arange( -(n-1)/2, 0)) )
k = 2.*pi*k/n
# Modify frequencies by dividing by ik
f[1] /= (1j * k[1])
f[n-1] /= (1j * k[n-1]) #Have cut out alot of harmonics here
f[2:n] = 0.0
f[0] = 0. # Set the arbitrary zero-frequency term to zero
return ifft(f).real # Reverse Fourier Transform
def pic(f, ncells, L):
""" f contains the position and velocity of all particles
"""
nparticles = len(f)/2 # Two values for each particle
pos = f[0:nparticles] # Position of each particle
vel = f[nparticles:] # Velocity of each particle
dx = L / float(ncells) # Cell spacing
# Ensure that pos is between 0 and L
pos = ((pos % L) + L) % L
# Calculate number density, normalised so 1 when uniform
density = calc_density(pos, ncells, L)
# Subtract ion density to get total charge density
rho = density - 1.
# Calculate electric field
E = -fft_integrate(rho)*dx
# Interpolate E field at particle locations
accel = -periodic_interp(E, pos/dx)
# Put back into a single array
return concatenate( (vel, accel) )
####################################################################
def run(pos, vel, L, ncells=None, out=[], output_times=linspace(0,20,100), cfl=0.5):
if ncells == None:
ncells = int(sqrt(len(pos))) # A sensible default
dx = L / float(ncells)
f = concatenate( (pos, vel) ) # Starting state
nparticles = len(pos)
time = 0.0
for tnext in output_times:
# Advance to tnext
stepping = True
while stepping:
# Maximum distance a particle can move is one cell
dt = cfl * dx / max(abs(vel))
if time + dt >= tnext:
# Next time will hit or exceed required output time
stepping = False
dt = tnext - time
#print "Time: ", time, dt
f = rk4step(pic, f, dt, args=(ncells, L))
time += dt
# Extract position and velocities
pos = ((f[0:nparticles] % L) + L) % L
vel = f[nparticles:]
# Send to output functions
for func in out:
func(pos, vel, ncells, L, time)
return pos, vel
####################################################################
#
# Output functions and classes
#
class Plot:
"""
Displays three plots: phase space, charge density, and velocity distribution
"""
def __init__(self, pos, vel, ncells, L):
d = calc_density(pos, ncells, L)
vhist, bins = histogram(vel, int(sqrt(len(vel))))
vbins = 0.5*(bins[1:]+bins[:-1])
# Plot initial positions
if got_gridspec:
self.fig = plt.figure()
self.gs = gridspec.GridSpec(4, 4)
ax = self.fig.add_subplot(self.gs[0:3,0:3])
self.phase_plot = ax.plot(pos, vel, '.')[0]
ax.set_title("Phase space")
ax = self.fig.add_subplot(self.gs[3,0:3])
self.density_plot = ax.plot(linspace(0, L, ncells), d)[0]
ax = self.fig.add_subplot(self.gs[0:3,3])
self.vel_plot = ax.plot(vhist, vbins)[0]
else:
self.fig = plt.figure()
self.phase_plot = plt.plot(pos, vel, '.')[0]
self.fig = plt.figure()
self.density_plot = plt.plot(linspace(0, L, ncells), d)[0]
self.fig = plt.figure()
self.vel_plot = plt.plot(vhist, vbins)[0]
plt.ion()
plt.show()
def __call__(self, pos, vel, ncells, L, t):
d = calc_density(pos, ncells, L)
vhist, bins = histogram(vel, int(sqrt(len(vel))))
vbins = 0.5*(bins[1:]+bins[:-1])
self.phase_plot.set_data(pos, vel) # Update the plot
self.density_plot.set_data(linspace(0, L, ncells), d)
self.vel_plot.set_data(vhist, vbins)
plt.draw()
class Summary:
def __init__(self):
self.t = []
self.firstharmonic = []
def __call__(self, pos, vel, ncells, L, t):
# Calculate the charge density
d = calc_density(pos, ncells, L)
# Amplitude of the first harmonic
fh = 2.*abs(fft(d)[1]) / float(ncells)
print "Time:", t, "First:", fh
self.t.append(t)
self.firstharmonic.append(fh)
####################################################################
#
# Functions to create the initial conditions
#
def landau(npart, L, alpha=0.2):
"""
Creates the initial conditions for Landau damping
"""
# Start with a uniform distribution of positions
pos = random.uniform(0., L, npart)
pos0 = pos.copy()
k = 2.*pi / L
for i in range(10): # Adjust distribution using Newton iterations
pos -= ( pos + alpha*sin(k*pos)/k - pos0 ) / ( 1. + alpha*cos(k*pos) )
# Normal velocity distribution
vel = random.normal(0.0, 1.0, npart)
return pos, vel
def twostream(npart, L, vbeam=2):
# Start with a uniform distribution of positions
pos = random.uniform(0., L, npart)
# Normal velocity distribution
vel = random.normal(0.0, 1.0, npart)
np2 = int(npart / 2)
vel[:np2] += vbeam # Half the particles moving one way
vel[np2:] -= vbeam # and half the other
return pos,vel
####################################################################
if __name__ == "__main__":
#vlist = [1., 5., 10.]
# 2-stream instability
#for v in vlist:
L = 100
ncells = 20
pos, vel = twostream(10000, L, 3.)
# Create some output classes
#p = Plot(pos, vel, ncells, L) # This displays an animated figure
s = Summary() # Calculates, stores and prints summary info
# Run the simulation
pos, vel = run(pos, vel, L, ncells,
out=[s],
#out=[p, s], # These are called each output
output_times=linspace(0.,40,100)) # The times to output
x = s.t #time
data = s.firstharmonic #harmonics y values
c = (diff(sign(diff(data))) < 0).nonzero()[0] + 1 # local max
timemax = []
maxh = []
rmaxh = []
for i in c:
timemax.append(x[i])
maxh.append(data[i])
#print maxh
for i in reversed(maxh):
print i
rmaxh.append(i)
#print rmaxh
def crap(rmaxh):
for j in range(len(rmaxh)-1):
if not rmaxh[j] > rmaxh[j+1]:
sliceindex = j
slicevalue = rmaxh[j]
return sliceindex
sliceindex = crap(rmaxh)
def twostream(timemax, rmaxh):
timestream = timemax[: sliceindex]
timenoisy = timemax[sliceindex :]
rmaxhstream = rmaxh[: sliceindex]
rmaxhnoisy= rmaxh[sliceindex:]
print timestream
print timenoisy
#coefs = polyfit(timestream, maxhstream,1)
#straightfit = polyval(polyfunc,timestream)
#return timestream, straightfit
#timestream, straightfit, = twostream(timemax, log(rmaxh))
# Summary stores an array of the first-harmonic amplitude
# Make a semilog plot to see exponential damping
plt.figure()
plt.plot(s.t, s.firstharmonic)
plt.plot(timemax, maxh, 'or')
plt.xlabel("Time [Normalised]")
plt.ylabel("First harmonic amplitude [Normalised]")
#plt.yscale('log')
#plt.legend()
plt.ioff() # This so that the windows stay open
plt.show()
I have not included our revised technique for the density function because this is a fresh code..
OR
Maybe I should just fit the function starting from the lowest peak to the highest peak of harmonics, which includes any other peaks within that range.
From multiple runs, I can see that the growth rate usually starts from the smallest peak and ends at the highest peak.
EDIT:
I think I understand your itemindex thing now.. But please clarify just in case. If I were to use that on the highest peak "max(maxh)", would it give me an "i" of where it is in the list of maxh values?
|

February 20th, 2013, 07:13 PM
|
|
Registered User
|
|
Join Date: Jan 2013
Posts: 22
Time spent in forums: 5 h 50 m 28 sec
Reputation Power: 0
|
|
Code:
for i in range(len(maxh)-1):
goodvalues = where(maxh[i] > min(maxh))
print goodvalues
I don't get it
Here is another picture:
Just trying to fit between the smallest peak and the highest peak since that is where the growth of the Harmonic appears to take place.
|

February 20th, 2013, 08:23 PM
|
 |
Contributing User
|
|
|
|
Sorry, I misread or found bad description of numpy.where.
I think you mean to do something like in this demonstration? I ran the code redirecting output, then graphed
$ python p.py > /tmp/a
$ gnuplot
gnuplot> plot'/tmp/a'w l
Code:
import numpy
################ generate some data, parabola plus sinusoid
x = numpy.arange(9999.0) / 9998.0
TAU = 2*numpy.pi # http://tauday.com/
xi = (8*TAU) * x
y = numpy.sin(xi) + (x-0.3)**2
################ find the positions of the local maxima
local_maxima = [] # array of indexes
i = 0
while i < (len(y)-2):
while (i < (len(y)-2)) and (y[i] < y[i+1]): # increasing
i += 1
local_maxima.append(i)
i += 1
while (i < (len(y)-2)) and (y[i+1] < y[i]): # decreasing
i += 1
i += 1
################ bracket the data to fit
ymaxes = numpy.array(y[local_maxima]) # values of max
right = numpy.argmax(ymaxes[1:])+1 # left because I know what the data looks like
left = numpy.argmax(-ymaxes[:-1]) # smallest maximum
R = local_maxima[right]
L = local_maxima[left]
best_fit_line = numpy.polyfit(x[L:R], y[L:R], 1)
for xy in zip(x,y):
print('{} {}'.format(*xy))
print('')
ex = x[L:R:8]
yfit = numpy.polyval(best_fit_line,ex)
for a in zip(ex, yfit):
print('{} {}'.format(*a))
Last edited by b49P23TIvg : February 20th, 2013 at 08:36 PM.
|

February 20th, 2013, 08:42 PM
|
|
Registered User
|
|
Join Date: Jan 2013
Posts: 22
Time spent in forums: 5 h 50 m 28 sec
Reputation Power: 0
|
|
Maybe I am just dumb, but I have not a clue what any of this:
Code:
################ bracket the data to fit
ymaxes = numpy.array(y[local_maxima]) # values of max
right = numpy.argmax(ymaxes[1:])+1 # left because I know what the data looks like
left = numpy.argmax(-ymaxes[:-1]) # smallest maximum
R = local_maxima[right]
L = local_maxima[left]
best_fit_line = numpy.polyfit(x[L:R], y[L:R], 1)
for xy in zip(x,y):
print('{} {}'.format(*xy))
Is supposed to be doing..
I plotted the sinusoidal (y) and got a silly looking distribution that looks like sine loosely fit on a parabola. I can't see which part of the distribution is attempting to have a fit on it.
|

February 20th, 2013, 09:54 PM
|
 |
Contributing User
|
|
|
|
|
Figure attached. The line fit to all the data between the smallest and greatest local maxima.
|

February 21st, 2013, 03:45 AM
|
|
Registered User
|
|
Join Date: Jan 2013
Posts: 22
Time spent in forums: 5 h 50 m 28 sec
Reputation Power: 0
|
|
Quote: | Originally Posted by b49P23TIvg Figure attached. The line fit to all the data between the smallest and greatest local maxima. |
Did the example include the minimums as well? Sure looks like it
Ok, I will try again thanks.
|

February 21st, 2013, 05:02 AM
|
|
Registered User
|
|
Join Date: Jan 2013
Posts: 22
Time spent in forums: 5 h 50 m 28 sec
Reputation Power: 0
|
|
I decided to copy the exact above method. Now I get this error:
Code:
x = s.t #time
y = s.firstharmonic #harmonics y values
local_maxima = [] # array of indexes
i = 0
while i < (len(y)-2):
while (i < (len(y)-2)) and (y[i] < y[i+1]): # increasing
i += 1
local_maxima.append(i)
i += 1
while (i < (len(y)-2)) and (y[i+1] < y[i]): # decreasing
i += 1
i += 1
print local_maxima
ymax = array(y[local_maxima])
Code:
error: ymax = array(y[local_maxima])
list indices must be integers, not list
It just feels like I'm chasing my own shadow now, problem after problem.
Both y and local_maxima are lists. Local_maxima is the list of integers (indices of the maximums), whilst y is just the list of my data.
In the example, this method is working. Why does it not work for me?
|

February 21st, 2013, 05:43 AM
|
|
Registered User
|
|
Join Date: Jan 2013
Posts: 22
Time spent in forums: 5 h 50 m 28 sec
Reputation Power: 0
|
|
Finally got this piece of **** working:
Code:
ymax = [y[int(j)] for j in local_maxima]
Now onto setting the boundaries for fitting...
EDIT:
What is the code for your plotting? My best fit line keeps going across the whole graph and not across the rightmost region.
|

February 21st, 2013, 06:45 AM
|
|
Registered User
|
|
Join Date: Jan 2013
Posts: 22
Time spent in forums: 5 h 50 m 28 sec
Reputation Power: 0
|
|
Ok I think I have managed to do this. I woke up 9.30am this morning and immediately started programming. It is now 12:41pm and I am in need of a shower and some R & R.
How does this look?
Here is the relevant part of the code:
Code:
x = s.t #time
y = s.firstharmonic #harmonics
maxh = []
timemax = []
c = (diff(sign(diff(y))) < 0).nonzero()[0] + 1 # local maximum actual values
for a in c:
timemax.append(x[a])
maxh.append(y[a])
################ find the positions of the local maxima
local_maxima = [] # array of indexes
i = 0
while i < (len(y)-2):
while (i < (len(y)-2)) and (y[i] < y[i+1]): # increasing
i += 1
local_maxima.append(i)
i += 1
while (i < (len(y)-2)) and (y[i+1] < y[i]): # decreasing
i += 1
i += 1
################ bracket the data to fit
ymax = array([y[int(j)] for j in local_maxima]) #Turning the list of ymax values into an array
right = argmax(ymax[1:])+1
left = argmax(-ymax[:-1])
R = local_maxima[right]
L = local_maxima[left]
best_fit_line = polyfit(x[L:R], y[L:R], 1)
yfit = polyval(best_fit_line, x)
# Summary stores an array of the first-harmonic amplitude
# Make a semilog plot to see exponential damping
plt.figure()
plt.plot(s.t, s.firstharmonic)
plt.plot(timemax, maxh, 'or')
plt.plot(x[L:R], yfit[L:R])
plt.xlabel("Time [Normalised]")
plt.ylabel("First harmonic amplitude [Normalised]")
#plt.yscale('log')
#plt.legend()
plt.ioff() # This so that the windows stay open
plt.show()
Sigh.. thanks once again for the assistance b49. I have one more thing to ask, but it is better to open a new topic.
It is more specific to PIC and involves "Quiet Start Conditions".
|

February 21st, 2013, 09:28 AM
|
 |
Contributing User
|
|
|
|
0) very good, you solved all these issues.
1) I use gnuplot---your green line shows you solved the problem. Quote: | Originally Posted by me $ python p.py > /tmp/a
$ gnuplot
gnuplot> plot'/tmp/a'w l |
2) I assumed you wanted to use all the data you'd computed for the curve in your fit, but choosing the boundaries from the smallest to greatest maxima. Quote: | Originally Posted by me The line fit to all the data between the smallest and greatest local maxima. | Yes, I certainly could have specified instead of "data", "all y of post 9:23 PM".
3) Concerning
error: ymax = array(y[local_maxima])
list indices must be integers, not list
y must have been a list not a numpy array. You solved this with a list comprehension. I'd think the following would also repair the situation:
ymax = array(y)[local_maxima]
|

February 24th, 2013, 02:55 PM
|
|
Registered User
|
|
Join Date: Jan 2013
Posts: 22
Time spent in forums: 5 h 50 m 28 sec
Reputation Power: 0
|
|
|
Ignore
|
Developer Shed Advertisers and Affiliates
| Thread Tools |
Search this Thread |
|
|
|
| Display Modes |
Rate This Thread |
Linear Mode
|
|
Posting Rules
|
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts
HTML code is Off
|
|
|
|
|