In this work, we provide a framework for analyzing multiresolution partitions (e.g., country, provinces, subdistricts) in which each individual data point belongs to exactly one partition in each layer (e.g., individual \(i\) belongs to subdistrict \(A\), province \(P\), and country \(Q\)).
We assume that a partition in a higher layer subsumes partitions in lower layers (e.g., a nation at the 1st layer subsumes all provinces at the 2nd layer).
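The nesting assumption can be checked on a table of labels before any analysis. The following is a minimal base-R sketch, not part of the package: the matrix `labels`, the helper `is_nested`, and the toy label values are all hypothetical names introduced for illustration. It assumes one row per individual and one column per layer, ordered from the coarsest layer to the finest.

```r
# Hypothetical label matrix: rows are individuals, columns are layers
# (layer 1 = country, layer 2 = province, layer 3 = subdistrict).
labels <- cbind(
  country     = rep("Q", 6),
  province    = c("P1", "P1", "P1", "P1", "P2", "P2"),
  subdistrict = c("A", "A", "B", "B", "C", "C")
)

# The hierarchy is nested when every cluster in layer k + 1 falls inside
# exactly one cluster of layer k.
is_nested <- function(labels) {
  for (k in seq_len(ncol(labels) - 1)) {
    # For each child cluster, count how many distinct parents it touches.
    parents_per_child <- tapply(labels[, k], labels[, k + 1],
                                function(p) length(unique(p)))
    if (any(parents_per_child > 1)) return(FALSE)
  }
  TRUE
}

nested_ok <- is_nested(labels)

# Break the nesting: subdistrict "B" now spans two provinces.
broken <- labels
broken[4, "province"] <- "P2"
nested_broken <- is_nested(broken)
```

Here `nested_ok` is `TRUE` and `nested_broken` is `FALSE`, because in the second matrix subdistrict `B` is no longer contained in a single province.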
We are given \(N\) individuals, each with a pair of real values \((x,y)\) generated from an independent variable \(X\) and a dependent variable \(Y\). Each individual \(i\) belongs to one partition per layer.
Our goal is to find the partitions at the highest possible layer such that all individuals in each partition share the same linear model \(Y=f(X)\), where \(f\) is a linear function.
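To make "sharing the same linear model" concrete, the sketch below compares one pooled regression line against separate lines per child cluster on toy data. This is a classical nested-model F-test in base R, used here only as an illustration; it is not the information-reduction criterion that the package reports, and the names `shared_model_pvalue`, `child`, `y_same`, and `y_diff` are hypothetical.

```r
set.seed(1)
n <- 100
x <- rnorm(2 * n)
child <- factor(rep(c("A", "B"), each = n))

# Case 1: both children follow y = 2x + noise (homogeneous parent).
y_same <- 2 * x + rnorm(2 * n, sd = 0.1)
# Case 2: child B follows y = -2x + noise instead (heterogeneous parent).
y_diff <- ifelse(child == "A", 2, -2) * x + rnorm(2 * n, sd = 0.1)

# Compare a single pooled line against separate lines per child.
# A small p-value suggests the children do NOT share one linear model.
shared_model_pvalue <- function(y, x, child) {
  pooled   <- lm(y ~ x)
  separate <- lm(y ~ x * child)
  anova(pooled, separate)[2, "Pr(>F)"]
}

p_same <- shared_model_pvalue(y_same, x, child)
p_diff <- shared_model_pvalue(y_diff, x, child)
```

In the second case the slopes differ in sign, so `p_diff` is extremely small and the parent would not count as homogeneous under this test. In the first case there is no systematic difference between the children, so the pooled model is an adequate description of the parent.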
Explanation: FindMaxHomoOptimalPartitions(DataT,gamma)
INPUT: DataT$clsLayer[i,k] is the cluster label of the ith individual in the kth cluster layer.
OUTPUT: out$Copt[p,1] = k means that the pth member of the maximal homogeneous partition is a cluster in the kth layer; the name of that cluster in the kth layer is out$Copt[p,2].
OUTPUT: out$Copt[p,3] is the “Model Information Reduction Ratio” of the pth member of the maximal homogeneous partition: a positive value means the linear model is better than the null model.
OUTPUT: out$Copt[p,4] is \(\eta( {C} )_{\text{cv}}\) of the pth member of the maximal homogeneous partition. The greater out$Copt[p,4], the more homogeneous the cluster.
OUTPUT: out$models[[k]][[j]] is the linear regression model of the jth cluster in the kth layer.
OUTPUT: out$models[[k]][[j]]$clustInfoRecRatio is the “Cluster Information Reduction Ratio” between the jth cluster in the kth layer and its children clusters in the (k+1)th layer: a positive value means the current cluster is better than its children clusters, so we should keep this cluster as a member of the maximal homogeneous partition instead of its children.
library(MRReg)
## Loading required package: caret
## Loading required package: ggplot2
## Loading required package: lattice
# Generate simulation data type 4 by having 100 individuals per homogeneous partition.
DataT <- SimpleSimulation(100, type = 4)
gamma <- 0.05 # Gamma parameter
out <- FindMaxHomoOptimalPartitions(DataT, gamma)
## Warning in summary.lm(submodels[[inx2]]): essentially perfect fit: summary may
## be unreliable
# Plotting the optimal homogeneous tree
# The red nodes are homogeneous partitions. All children of a homogeneous partition node share the same linear model.
plotOptimalClustersTree(out)
# Printing the optimal homogeneous partitions
# Selected features: 1 is reserved for the intercept, and d is a selected feature if Y[i] ~ X[i,d-1] in the linear model.
# Note that the clustInfoRecRatio values are always NA for last-layer partitions.
PrintOptimalClustersResult(out, selFeature = TRUE)
## [1] "========== List of Optimal Clusters =========="
## [1] "Layer2,ClS-C1:clustInfoRecRatio=0.08,modelInfoRecRatio=0.54, eta(C)cv=1.00"
## [1] "Selected features"
## [1] 2
## [1] "Layer3,ClS-C11:clustInfoRecRatio=0.10,modelInfoRecRatio=0.67, eta(C)cv=1.00"
## [1] "Selected features"
## [1] 2
## [1] "Layer3,ClS-C12:clustInfoRecRatio=0.10,modelInfoRecRatio=0.65, eta(C)cv=1.00"
## [1] "Selected features"
## [1] 3
## [1] "Layer3,ClS-C13:clustInfoRecRatio=0.09,modelInfoRecRatio=0.52, eta(C)cv=1.00"
## [1] "Selected features"
## [1] 4
## [1] "Layer3,ClS-C14:clustInfoRecRatio=0.09,modelInfoRecRatio=0.45, eta(C)cv=1.00"
## [1] "Selected features"
## [1] 5
## [1] "Layer4,ClS-C21:clustInfoRecRatio=NA,modelInfoRecRatio=0.56, eta(C)cv=1.00"
## [1] "Selected features"
## [1] 2
## [1] "Layer4,ClS-C22:clustInfoRecRatio=NA,modelInfoRecRatio=0.65, eta(C)cv=1.00"
## [1] "Selected features"
## [1] 3
## [1] "Layer4,ClS-C23:clustInfoRecRatio=NA,modelInfoRecRatio=0.69, eta(C)cv=1.00"
## [1] "Selected features"
## [1] 4
## [1] "Layer4,ClS-C24:clustInfoRecRatio=NA,modelInfoRecRatio=0.46, eta(C)cv=1.00"
## [1] "Selected features"
## [1] 5
## [1] "Layer4,ClS-C25:clustInfoRecRatio=NA,modelInfoRecRatio=0.65, eta(C)cv=1.00"
## [1] "Selected features"
## [1] 6
## [1] "Layer4,ClS-C26:clustInfoRecRatio=NA,modelInfoRecRatio=0.43, eta(C)cv=1.00"
## [1] "Selected features"
## [1] 7
## [1] "Layer4,ClS-C27:clustInfoRecRatio=NA,modelInfoRecRatio=0.63, eta(C)cv=1.00"
## [1] "Selected features"
## [1] 8
## [1] "Layer4,ClS-C28:clustInfoRecRatio=NA,modelInfoRecRatio=0.66, eta(C)cv=1.00"
## [1] "Selected features"
## [1] 9
## [1] "min eta(C)cv:1.000000"