sPipeline
carries out the complete ab initio training for the input
data. It returns an object of class "sMap".
sPipeline(data = NULL, xdim = NULL, ydim = NULL, nHex = NULL, lattice = c("hexa", "rect"), shape = c("suprahex", "sheet", "triangle", "diamond", "hourglass", "trefoil", "ladder", "butterfly", "ring", "bridge"), scale = 5, init = c("linear", "uniform", "sample"), algorithm = c("batch", "sequential"), alphaType = c("invert", "linear", "power"), neighKernel = c("gaussian", "bubble", "cutgaussian", "ep", "gamma"), finetuneSustain = FALSE, verbose = TRUE)
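When xdim, ydim and nHex are all left as NULL, the grid size has to be estimated from the input data. The base-R sketch below reproduces the numbers reported in the example output on this page for the default "suprahex" shape (100 input rows give 61 hexagons, r=5, xdim=ydim=9). It rests on two assumptions that should be checked against the package source: that the target size is scale * sqrt(nrow(data)), and that this target is rounded up to the next complete supra-hexagonal grid. The helper name suprahex_size is hypothetical and is not a supraHex function.

```r
# Hypothetical helper (not part of supraHex): smallest supra-hexagonal grid
# that holds at least scale * sqrt(n_rows) hexagons.
suprahex_size <- function(n_rows, scale = 5) {
  target <- scale * sqrt(n_rows)   # assumed heuristic for the target size
  r <- 1
  # a supra-hexagonal grid of radius r contains 1 + 3*r*(r-1) hexagons
  while (1 + 3 * r * (r - 1) < target) {
    r <- r + 1
  }
  list(r = r,
       nHex = 1 + 3 * r * (r - 1),
       xdim = 2 * r - 1,
       ydim = 2 * r - 1)
}

size <- suprahex_size(n_rows = 100, scale = 5)
str(size)
```

With these assumptions a smaller scale gives a smaller map: suprahex_size(100, scale = 1) stops at the first radius whose grid holds at least 10 hexagons.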
an object of class "sMap", a list with the following components:
nHex: the total number of hexagons/rectangles in the grid
xdim: x-dimension of the grid
ydim: y-dimension of the grid
r: the hypothetical radius of the grid
lattice: the grid lattice
shape: the grid shape
coord: a matrix of nHex x 2, with rows corresponding to the coordinates of all hexagons/rectangles in the 2D map grid
polygon: a data frame of three columns ('x','y','id') storing polygon location per hexagon in the 2D map grid
init: the initialisation method
neighKernel: the training neighborhood kernel
codebook: a codebook matrix of nHex x ncol(data), with rows corresponding to prototype vectors in the input high-dimensional space
hits: a vector of length nHex, each element giving the number of input data vectors that hit the corresponding hexagon/rectangle
mqe: the mean quantization error for the "best" BMH
call: the call that produced this result
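To make the relationship between the codebook, hits and mqe components concrete, here is a minimal base-R sketch on a toy codebook. It assumes Euclidean distance and takes mqe to be the mean distance between each input vector and its best-matching prototype. The variable names are illustrative only, and this is not the package's implementation.

```r
set.seed(1)
data <- matrix(rnorm(20 * 3), nrow = 20, ncol = 3)     # 20 input vectors
codebook <- matrix(rnorm(4 * 3), nrow = 4, ncol = 3)   # toy codebook, nHex = 4

# Euclidean distance from every input vector (rows) to every prototype (columns)
dist_mat <- sapply(seq_len(nrow(codebook)), function(j) {
  sqrt(rowSums(sweep(data, 2, codebook[j, ])^2))
})

# index of the best-matching prototype for each input vector
bmh <- apply(dist_mat, 1, which.min)

# hits: number of input vectors assigned to each prototype
hits <- tabulate(bmh, nbins = nrow(codebook))

# mqe (assumed definition): mean distance to the best-matching prototype
mqe <- mean(dist_mat[cbind(seq_len(nrow(data)), bmh)])
```

By construction sum(hits) equals nrow(data), which is a quick sanity check that also applies to the hits component of a trained map.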
The pipeline sequentially consists of:
sTopology, used to define the topology of a grid (with "suprahex" shape by default) according to the input data;
sInitial, used to initialise the codebook matrix given the pre-defined topology and the input data (by default using the "linear" initialisation method);
sTrainology and sTrainSeq (or sTrainBatch for the batch algorithm), used to get the grid map trained at both "rough" and "finetune" stages. If instructed, the "finetune" training is sustained until the mean quantization error starts to get worse;
sBMH, used to identify the best-matching hexagons/rectangles (BMH) for the input data; these response data are appended to the resulting object of class "sMap".
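The steps listed above can also be run one at a time, which helps when inspecting intermediate objects. The sketch below follows that order with the batch algorithm. The argument names (sTopol, sMap, sTrain, stage) are written as they appear in the package's help pages to the best of my knowledge and should be checked against the installed version. The code is wrapped in requireNamespace() so that it does nothing when supraHex is not installed.

```r
if (requireNamespace("supraHex", quietly = TRUE)) {
  library(supraHex)
  data <- matrix(rnorm(100 * 10), nrow = 100, ncol = 10)

  # 1) define the grid topology from the input data
  sTopol <- sTopology(data = data, lattice = "hexa", shape = "suprahex")

  # 2) initialise the codebook matrix on that topology
  sI <- sInitial(data = data, sTopol = sTopol, init = "linear")

  # 3) define the training settings for each stage, then train (batch algorithm)
  sT_rough <- sTrainology(sMap = sI, data = data, stage = "rough")
  sT_finetune <- sTrainology(sMap = sI, data = data, stage = "finetune")
  sM_rough <- sTrainBatch(sMap = sI, data = data, sTrain = sT_rough)
  sM_finetune <- sTrainBatch(sMap = sM_rough, data = data, sTrain = sT_finetune)

  # 4) identify the best-matching hexagons for the input data
  response <- sBMH(sMap = sM_finetune, data = data)
  str(response)
}
```

Calling sPipeline directly remains the simpler route; the step-by-step form is mainly useful for swapping in a different initialisation or training stage.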
Hai Fang and Julian Gough. (2014) supraHex: an R/Bioconductor package for tabular omics data analysis using a supra-hexagonal map. Biochemical and Biophysical Research Communications, 443(1), 285-289.
# 1) generate an iid normal random matrix of 100x10
data <- matrix(rnorm(100*10, mean=0, sd=1), nrow=100, ncol=10)
colnames(data) <- paste0("S", 1:10)

# 2) train with the default setup but with different neighborhood kernels
# 2a) with "gaussian" kernel
sMap <- sPipeline(data=data, neighKernel="gaussian")
#> Start at 2018-01-18 16:56:09
#> First, define topology of a map grid (2018-01-18 16:56:09)...
#> Second, initialise the codebook matrix (61 X 10) using 'linear' initialisation, given a topology and input data (2018-01-18 16:56:09)...
#> Third, get training at the rough stage (2018-01-18 16:56:09)...
#> 1 out of 7 (2018-01-18 16:56:09) updated (2018-01-18 16:56:09)
#> 2 out of 7 (2018-01-18 16:56:09) updated (2018-01-18 16:56:09)
#> 3 out of 7 (2018-01-18 16:56:09) updated (2018-01-18 16:56:09)
#> 4 out of 7 (2018-01-18 16:56:09) updated (2018-01-18 16:56:09)
#> 5 out of 7 (2018-01-18 16:56:09) updated (2018-01-18 16:56:09)
#> 6 out of 7 (2018-01-18 16:56:09) updated (2018-01-18 16:56:09)
#> 7 out of 7 (2018-01-18 16:56:09) updated (2018-01-18 16:56:09)
#> Fourth, get training at the finetune stage (2018-01-18 16:56:09)...
#> 1 out of 25 (2018-01-18 16:56:09) updated (2018-01-18 16:56:09)
#> 2 out of 25 (2018-01-18 16:56:09) updated (2018-01-18 16:56:09)
#> 3 out of 25 (2018-01-18 16:56:09) updated (2018-01-18 16:56:09)
#> 4 out of 25 (2018-01-18 16:56:09) updated (2018-01-18 16:56:09)
#> 5 out of 25 (2018-01-18 16:56:09) updated (2018-01-18 16:56:09)
#> 6 out of 25 (2018-01-18 16:56:09) updated (2018-01-18 16:56:09)
#> 7 out of 25 (2018-01-18 16:56:09) updated (2018-01-18 16:56:09)
#> 8 out of 25 (2018-01-18 16:56:09) updated (2018-01-18 16:56:09)
#> 9 out of 25 (2018-01-18 16:56:09) updated (2018-01-18 16:56:09)
#> 10 out of 25 (2018-01-18 16:56:09) updated (2018-01-18 16:56:09)
#> 11 out of 25 (2018-01-18 16:56:09) updated (2018-01-18 16:56:09)
#> 12 out of 25 (2018-01-18 16:56:09) updated (2018-01-18 16:56:09)
#> 13 out of 25 (2018-01-18 16:56:09) updated (2018-01-18 16:56:09)
#> 14 out of 25 (2018-01-18 16:56:09) updated (2018-01-18 16:56:09)
#> 15 out of 25 (2018-01-18 16:56:09) updated (2018-01-18 16:56:09)
#> 16 out of 25 (2018-01-18 16:56:09) updated (2018-01-18 16:56:09)
#> 17 out of 25 (2018-01-18 16:56:09) updated (2018-01-18 16:56:09)
#> 18 out of 25 (2018-01-18 16:56:09) updated (2018-01-18 16:56:09)
#> 19 out of 25 (2018-01-18 16:56:09) updated (2018-01-18 16:56:09)
#> 20 out of 25 (2018-01-18 16:56:09) updated (2018-01-18 16:56:09)
#> 21 out of 25 (2018-01-18 16:56:09) updated (2018-01-18 16:56:09)
#> 22 out of 25 (2018-01-18 16:56:09) updated (2018-01-18 16:56:09)
#> 23 out of 25 (2018-01-18 16:56:09) updated (2018-01-18 16:56:09)
#> 24 out of 25 (2018-01-18 16:56:09) updated (2018-01-18 16:56:09)
#> 25 out of 25 (2018-01-18 16:56:09) updated (2018-01-18 16:56:09)
#> Next, identify the best-matching hexagon/rectangle for the input data (2018-01-18 16:56:09)...
#> Finally, append the response data (hits and mqe) into the sMap object (2018-01-18 16:56:09)...
#> Below are the summaries of the training results:
#> dimension of input data: 100x10
#> xy-dimension of map grid: xdim=9, ydim=9, r=5
#> grid lattice: hexa
#> grid shape: suprahex
#> dimension of grid coord: 61x2
#> initialisation method: linear
#> dimension of codebook matrix: 61x10
#> mean quantization error: 4.92761300512866
#> Below are the details of trainology:
#> training algorithm: batch
#> alpha type: invert
#> training neighborhood kernel: gaussian
#> trainlength (x input data length): 7 at rough stage; 25 at finetune stage
#> radius (at rough stage): from 3 to 1
#> radius (at finetune stage): from 1 to 1
#> End at 2018-01-18 16:56:09
#> Runtime in total is: 0 secs

# 2b) with "bubble" kernel
# sMap <- sPipeline(data=data, neighKernel="bubble")
# 2c) with "cutgaussian" kernel
# sMap <- sPipeline(data=data, neighKernel="cutgaussian")
# 2d) with "ep" kernel
# sMap <- sPipeline(data=data, neighKernel="ep")
# 2e) with "gamma" kernel
# sMap <- sPipeline(data=data, neighKernel="gamma")

# 3) visualise multiple component planes of a supra-hexagonal grid
visHexMulComp(sMap, colormap="jet", ncolors=20, zlim=c(-1,1), gp=grid::gpar(cex=0.8))

# 4) train with the default setup but using the "trefoil" shape and the sequential algorithm
sMap <- sPipeline(data=data, shape="trefoil", algorithm="sequential")
#> Start at 2018-01-18 16:56:09
#> First, define topology of a map grid (2018-01-18 16:56:09)...
#> Second, initialise the codebook matrix (61 X 10) using 'linear' initialisation, given a topology and input data (2018-01-18 16:56:09)...
#> Third, get training at the rough stage (2018-01-18 16:56:09)...
#> 1 out of 700 (2018-01-18 16:56:09)
#> 70 out of 700 (2018-01-18 16:56:09)
#> 140 out of 700 (2018-01-18 16:56:09)
#> 210 out of 700 (2018-01-18 16:56:09)
#> 280 out of 700 (2018-01-18 16:56:09)
#> 350 out of 700 (2018-01-18 16:56:09)
#> 420 out of 700 (2018-01-18 16:56:09)
#> 490 out of 700 (2018-01-18 16:56:09)
#> 560 out of 700 (2018-01-18 16:56:09)
#> 630 out of 700 (2018-01-18 16:56:09)
#> 700 out of 700 (2018-01-18 16:56:09)
#> Fourth, get training at the finetune stage (2018-01-18 16:56:09)...
#> 1 out of 2500 (2018-01-18 16:56:09)
#> 250 out of 2500 (2018-01-18 16:56:10)
#> 500 out of 2500 (2018-01-18 16:56:10)
#> 750 out of 2500 (2018-01-18 16:56:10)
#> 1000 out of 2500 (2018-01-18 16:56:10)
#> 1250 out of 2500 (2018-01-18 16:56:10)
#> 1500 out of 2500 (2018-01-18 16:56:10)
#> 1750 out of 2500 (2018-01-18 16:56:10)
#> 2000 out of 2500 (2018-01-18 16:56:10)
#> 2250 out of 2500 (2018-01-18 16:56:10)
#> 2500 out of 2500 (2018-01-18 16:56:10)
#> Next, identify the best-matching hexagon/rectangle for the input data (2018-01-18 16:56:10)...
#> Finally, append the response data (hits and mqe) into the sMap object (2018-01-18 16:56:10)...
#> Below are the summaries of the training results:
#> dimension of input data: 100x10
#> xy-dimension of map grid: xdim=11, ydim=11, r=6
#> grid lattice: hexa
#> grid shape: trefoil
#> dimension of grid coord: 61x2
#> initialisation method: linear
#> dimension of codebook matrix: 61x10
#> mean quantization error: 5.91367829503479
#> Below are the details of trainology:
#> training algorithm: sequential
#> alpha type: invert
#> training neighborhood kernel: gaussian
#> trainlength (x input data length): 7 at rough stage; 25 at finetune stage
#> radius (at rough stage): from 3 to 1
#> radius (at finetune stage): from 1 to 1
#> End at 2018-01-18 16:56:10
#> Runtime in total is: 1 secs
visHexMulComp(sMap, colormap="jet", ncolors=20, zlim=c(-1,1), gp=grid::gpar(cex=0.8))
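The kernel variants in step 2 of the example can also be compared in a single loop rather than one call at a time. The sketch below trains one map per kernel with verbose = FALSE and collects the mqe component of each result. It uses only arguments shown in the usage above; the seed and the variable names kernels and mqe_by_kernel are illustrative choices. It is skipped when supraHex is not installed.

```r
if (requireNamespace("supraHex", quietly = TRUE)) {
  library(supraHex)
  set.seed(1)
  data <- matrix(rnorm(100 * 10), nrow = 100, ncol = 10)

  kernels <- c("gaussian", "bubble", "cutgaussian", "ep", "gamma")

  # train one map per neighborhood kernel and keep its mean quantization error
  mqe_by_kernel <- sapply(kernels, function(k) {
    sPipeline(data = data, neighKernel = k, verbose = FALSE)$mqe
  })

  print(round(mqe_by_kernel, 3))
}
```

On random data such as this the differences between kernels are not expected to be meaningful; the loop is only a convenient way to run the comparison on real data.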
sPipeline.r
sPipeline.Rd
sPipeline.pdf
sTopology, sInitial, sTrainology, sTrainSeq, sTrainBatch, sBMH, visHexMulComp