The user should be familiar with Artificial Neural Networks to use this plugin effectively.
NeuralNet has:
A) A classic 3-layer classification-type neural network with
only one output node. In the two functions included, the optimization is done differently:
1) NeuralNetBP: by the usual back propagation
2) NeuralNetRP: by Resilient Propagation, or RPROP (Riedmiller and Braun)
The output of these two functions is either yes (255) or no (0).
B) NeuralNetLN: a linear network with a single layer, with weights optimized by the RPROP method. The
output of this function is limited to values between 0 and 255.
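Both back propagation and RPROP adjust weights from the error gradient, but RPROP uses only the gradient's sign together with a per-weight adaptive step size, which is why it typically needs fewer iterations. Below is a minimal illustrative sketch of the RPROP step-size rule on a toy 1-D problem; this is an assumption-laden illustration of the general technique, not the plugin's actual implementation.

```python
# Minimal RPROP-style step-size adaptation (after Riedmiller & Braun)
# demonstrated on a toy 1-D quadratic loss. Illustrative sketch only,
# not the plugin's code.

def rprop_minimize(grad, w=5.0, step=0.5, iters=50,
                   eta_plus=1.2, eta_minus=0.5,
                   step_max=50.0, step_min=1e-6):
    prev_sign = 0
    for _ in range(iters):
        g = grad(w)
        sign = (g > 0) - (g < 0)
        if sign * prev_sign > 0:      # same direction: grow the step
            step = min(step * eta_plus, step_max)
        elif sign * prev_sign < 0:    # overshoot: shrink the step
            step = max(step * eta_minus, step_min)
        w -= sign * step              # only the SIGN of the gradient is used
        prev_sign = sign
    return w

# loss(w) = (w - 2)^2, gradient = 2*(w - 2); the minimum is at w = 2
w_opt = rprop_minimize(lambda w: 2.0 * (w - 2.0))
```

Because the update ignores the gradient's magnitude, the step size grows quickly on long slopes and shrinks near the minimum, which gives RPROP its fast convergence.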
The plugin has been tested with:
A) NeuralNetBP and NeuralNetRP: a greyscale image as input and a corresponding
edge-detected image in which
edges (above a threshold) were marked as white (255)
and non-edges as black (0). Training was done using a representative window of
this frame. The solution was then tested on the full frame.
B) NeuralNetLN: an image corrupted by regular-frequency interference
and its 'VFanFilter'-ed output as the training set. Note that with high
starting weights, results may be inferior.
RP converges faster than BP and so requires fewer iterations. However, since the minimization criteria are different, the results can differ; in certain situations one method gives better results than the other.
The start frame of the input clip must be the frame used for training. The training clip must contain the processed result corresponding to that start frame of the input clip.
Training needs to be done in a small window (or along a line) containing fully representative cases and covering the full image amplitude range; otherwise strange, unexpected results may be seen. Training time depends on the window size, number of nodes, iterations and the 'bestof' value, and may take from a few seconds to several minutes.
After training, the optimized weights are used for processing all frames of the input clip. NeuralNetRP and NeuralNetBP output all values above a threshold as white, and the rest as black.
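The classification output stage can be pictured as a simple binarization. The sketch below assumes a 0-255 value scale and a threshold given in percent; the function name is hypothetical, not part of the plugin.

```python
# Illustrative sketch of mapping a classification-type output to
# black/white. Assumes a 0..255 scale and a threshold in percent;
# 'binarize' is a hypothetical helper, not a plugin function.

def binarize(value, threshold_pct):
    """Return 255 (white) if value exceeds the threshold, else 0 (black)."""
    return 255 if value > threshold_pct * 255 / 100 else 0
```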
A facility to monitor the error and other diagnostic parameters during training is provided. The first output frame is always a frame with a diagnostic plot (and, for the classification type, a histogram of the processed first frame). On the diagnostic plot, the horizontal scale is the iteration number and the vertical scale is the scaled parameter value. On the histogram, the horizontal scale is the percentage of the image's Y value. The window outline or line used for training is also shown on this plot.
For BP and RP, if 'test' is true, the input start frame 'sf' is repeatedly processed with the NeuralNet solution and the output is displayed using threshold values stepped from 0 to 100% in 'ef' steps, one step per frame. If 'ef' is 50, then 50 steps are used, i.e. the threshold increases by 2% at each frame step.
In certain cases, after some iterations the error may reach a local minimum, or may even start to increase with further iterations. Retraining with a different set of starting weights can sometimes correct this problem. The 'bestof' parameter trains with several different weight sets and saves the weights that produced the least error for later use; however, this multiplies the training time. The 'wset' parameter skips a number of weight sets.
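The 'bestof' idea is a standard multi-restart strategy: run training several times from different random starting weights and keep the run with the lowest final error. A hedged sketch, with hypothetical names and a dummy stand-in for the actual training:

```python
# Illustrative multi-restart loop for the 'bestof' idea: train with
# several starting weight sets, keep the lowest-error result.
# 'train_once' is a dummy stand-in for a real training run.
import random

def train_once(seed):
    # Stand-in: returns (final_error, weights) for one training run.
    rng = random.Random(seed)
    w = [rng.uniform(-0.5, 0.5) for _ in range(4)]
    error = sum(abs(x) for x in w)      # dummy error measure
    return error, w

def best_of(n_sets, wset=0):
    best_err, best_w = float("inf"), None
    # 'wset' skips the first few weight sets, as the parameter allows
    for seed in range(wset, wset + n_sets):
        err, w = train_once(seed)
        if err < best_err:
            best_err, best_w = err, w
    return best_err, best_w
```

The cost is linear in the number of weight sets tried, which is why 'bestof' multiplies the training time.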
This plugin works in the YUY2, YV12, RGB32 and RGB24 color spaces. Only the Red or Luma (Y) channel is processed.
The input pattern is taken as a grid of 'xpts' * 'ypts' pixels. The output is the value at the center of this grid in the training clip. The input grid is then moved by one pixel and compared with the corresponding output. The training line or window should be sized to provide a sufficient number of cases and positioned so that it encompasses all representative cases.
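The sliding-grid pattern extraction described above can be sketched in a few lines. This is a pure-Python illustration with a hypothetical helper name; the plugin does this internally on the clip's pixel data.

```python
# Sketch of the input-pattern extraction: an xpts*ypts grid slides one
# pixel at a time over the image, and the target is the training clip's
# value at the grid centre. 'make_patterns' is a hypothetical helper.

def make_patterns(image, target, xpts, ypts):
    """image/target: 2-D lists of equal size. Yields (patch, centre_value)."""
    h, w = len(image), len(image[0])
    for y in range(h - ypts + 1):
        for x in range(w - xpts + 1):
            patch = [image[y + j][x + i] for j in range(ypts)
                                         for i in range(xpts)]
            cy, cx = y + ypts // 2, x + xpts // 2
            yield patch, target[cy][cx]
```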
In case the training clip was obtained with a grid of x by y, specifying values other than x and y for xpts and ypts may not be a good idea, as the extraneous data will only add noise and the network may not converge.
Most of the parameters for the three functions are the same. While all (except the clips) have default values, some parameter values (training window, hnodes, weight) need to be specified for better results. Also, parameter names should invariably be used when specifying values.
Below is an example of NeuralNetLN. On the left is the NeuralNetLN output, in the center is the input,
and on the right is the VFanFilter-ed frame from which a line is used for training. The script used is:
f=imagereader("D:\TransPlugins\images\msintrf0.jpg",0,1,25,false).converttoyuy2()#image with noise
i=imagereader("D:\TransPlugins\images\fanmsint0.jpeg",0,1,25,false).converttoyuy2()#fan filtered image
ln=NeuralNetLN(f,i,tlx=250,trx=450,tby=400,xpts=9,ypts=1,iter=200,wh=0.01, wset=1,bestof=1,line=true)
stackhorizontal(ln,f,i)
reduceby2()