Введено вдтань Фреше мiж деревами та доведено, що ця видстань е метрикою. Розроблено метод i алгоритми визначення видсташ мiж не опуклими областями. Спроектований i про-грамно реалiзований модуль визначення вдстат Фреше мiж скелетонами. Дослиджено похибки результатiв сегментацп для метрик Хаусдорфа та Фреше мiж деревами на прикладi бюмедич-них зображень
Ключовi слова: метрика Фреше, метрика Хаусдорфа, не опуклi областi, бюмедичш зобра-ження, похибка сегментации
Введено расстояние Фреше между деревьями и доказано, что это расстояние является метрикой. Разработан метод и алгоритмы определения расстояния между не выпуклыми областями. Спроектирован и программно реализован модуль определения расстояния Фреше между скелетонами. Исследованы погрешностирезуль-татов сегментации для метрик Хаусдорфа и Фреше между деревьями на примере биомедицинских изображений
UDC 004.932
|DOI: 10.15587/1729-4061.2017.1194931
DEVELOPMENT OF A METRIC AND THE METHODS FOR QUANTITATIVE ESTIMATION OF THE SEGMENTATION OF BIOMEDICAL IMAGES
O. Berezsky
Doctor of Technical Sciences, Professor, Head of Department* E-mail: ob@tneu.edu.ua M. Zarichnyi Doctor of Physical and Mathematical Sciences, Professor Department of Geometry and Topology Ivan Franko National University of Lviv Universytetska str., 1, Lviv, Ukraine, 79000 E-mail: zarichnyi@yahoo.com O. Pi ts u n Postgraduate Student* E-mail: o.pitsun@tneu.edu.ua *Department of Computer Engineering Ternopil National Economic University Lvivska str., 11, Ternopil, Ukraine, 46020
Digital microscopy plays an important role in the analysis of biomedical images and in diagnosis. We define biomedical images as the images obtained using the technical tools employed for medical purposes in order to visualize processes. Of special importance is the quality of digital images in oncology. To diagnose a patient in oncology, images of individual cells (cytological images) and the images of a group of cells (histological) are used.
Modern digital microscopy has travelled a long way in its development from hand microscopes to modern robotized complexes.
To categorize digital microscopy, we shall introduce the following criteria: automation, software, the use of network technologies.
According to the first criterion, there are hand microscopes, which are independently operated by laboratory staff. Such microscopes are most common in the clinics of Ukraine. Automatics of the automated microscopes controls the motion and focusing of the preparation, change of filters, lenses, lighting [1]. Hardware of the system of automated microscopy (SAM) consists of a microscope, a video camera, and a computer. The basis of SAM software is the systems of image analysis (SIA). The latest modern group is the automated (robotized) microscopes that enable automatic processing of images. Based on the automated microscopes, modern complexes of automatic microscopy are built (CAM) [2, 3].
In terms of software, there are software systems (SS) that perform systemic functions: control over a microscope, a video camera, etc. Another class of SS is the information systems, in which databases of images, databases of patients, incoming and outgoing reporting documents are implemented. The next class of SS is the systems for analysis of images, in which image processing algorithms are implemented at three levels - low, medium and high. A separate class of SS is the expert systems that can simulate reflections of a physician-expert when diagnosing a patient. The combination of expert systems and systems for analysis of images have spawned hybrid systems that combine technologies of knowledge bases and computer vision. This direction is promising at the present stage. Modern CAM include programs that enable the organization of telemedicine.
The level of using network technologies determines CAM that are implemented on separate workstations. Such use is limited by the local work of individual laboratory staff. Another level is the use of CAM in a local network of a clinic, hospital, which makes it possible for several specialists to work simultaneously. The use of CAM in the global Internet makes it possible to implement the technology of telemedicine.
We analyzed commercially available systems of digital microscopy that vary in price depending on configuration. For example, price of the system BioVision [4] is USD 1,850, VideoTesT-Morpho-5 [5] - USD 3,440, MEKOS-C2 [6] -USD 12,208.
The bulk of the price of such systems is the software.
The base of all SAM and CAM are SIA. SIA include all image processing levels: low, middle and high. Low level is used to improve the quality of images and contains algorithms for image filtering, histogram equalization, contrast improvement, etc. Middle level carries out segmentation, differentiation of image contours, calculation of contour and texture features. High level performs classification of micro objects based on attributes [7, 8].
The main operation at the middle level is segmentation. The accuracy of segmented objects affects subsequent calculations of quantitative attributes. At present, there is no universal image segmentation algorithm. Each algorithm is designed for a specific subject area. That is why assessment of the quality of segmentation results is very important. To assess the quality of segmentation, objective and subjective criteria are applied. Classification criteria for the evaluation of segmentation is given in paper [9].
Subjective estimate is given by a human and is based on qualitative indicators. An objective assessment is based on quantitative indicators. The advantage of employing quantitative indicators to evaluate segmentation is the absence of the human factor. The highest accuracy of segmentation evaluation is demonstrated by methods that are based on the use of metrics.
Therefore, it is important in terms of quantitative estimation of the quality of image segmentation to apply an approach that is based on using the Hausdorff and Frechet metrics [10].
We shall analyze modern algorithms for calculating the Hausdorff and Frechet distances. Algorithms to compute the Hausdorff distance are developed only for convex polygons. In paper [11], authors constructed an algorithm to reduce the number of vertices of a convex polygon for the set error £, in the Hausdorff metric. The algorithm is used for convex polygons only. In article [12], authors calculated the Hausdorff distance between algebraic flat curves using the Vor-onoi diagrams. The algorithm is applied for particular cases of algebraic curves and has high computational complexity. The algorithm for finding the minimal Hausdorff distance in metrics L{ and L„ is reported in work [13]. The resulting computational complexity is O(n2log2n). A search method for a given pattern of the image that has the smallest distance in the Hausdorff metric is given in paper [14]. In this case, the authors use a transmission of the assigned pattern to the desired image. The algorithm has high computational complexity. Article [15] reports a method of finding the minimum weighted tree based on the Hausdorff metric for a ¿-dimensional space. The problem of the approximation of such a tree is solved over polynomial time.
An algorithm to compute the discrete Frechet distance for polygonal curves is given in article [16]. A given algorithm employs groups of conversion of rigid motions. Computational complexity is O(m2n2), where m and n are the number of segments along the first and second curve. In paper [17], authors show an algorithm for computing
the Frechet distance for surfaces, which are represented by simple polygons. The algorithm has polynomial complexity. In work [18], author developed an algorithm to calculate the Frechet distance between two curves, which are assigned by a set of m+n linearly approximated segments. Computational complexity is O(m*n). Authors of [19] obtained, for closed polygonal curves, a computational complexity of O(m*ri) for the Frechet metric. In [20], authors demonstrate an algorithm to compute a discrete Frechet distance with inaccurately assigned vertices. For a ¿-dimensional space, they received computational complexity O(d*m*ri). An algorithm to calculate the Frechet distance between non-flat surfaces is given in paper [21]. The authors reached the polynomial time in the L ^ metric. In article [22], authors developed a fast algorithm for finding a similarity of polygonal trees in the Frechet metric. The algorithm has polynomial complexity. In works [23, 24], finding the distance between parametrically set curves is carried out over polynomial time.
Thus, an analysis of the scientific literature reveals that modern algorithms for finding the Frechet distance for flat closed curves have the least computational complexity O(mxn). Algorithms for finding the Hausdorff distance between regions are computationally complex. There are no efficient algorithms for computing the Hausdorff distance for non-convex regions.
Therefore, it is necessary to develop a metric for finding a distance between non-convex regions. For this purpose, we shall use a description of the image region by skeletons [8]. Skeletons are the middle lines, which describe arbitrary regions and reduce dimensionality of a region description. Thus, skeletons are graphs without cycles, in other words, trees.
Therefore, further direction of present research is to develop a Frechet metric between trees (skeletons), which would make it possible to find distances between arbitrary (convex and non-convex) geometrical objects. In addition, by employing the developed metric, it is necessary to construct a method and algorithms for computing the distance between non-convex regions.
The aim of present study is to introduce a metric between trees to find the distance between non-convex regions of segmented images. This would make it possible to decrease the computational complexity of algorithms that compare non-convex regions.
To accomplish the aim, the following tasks have been set:
- to introduce and substantiate the Frechet metric between trees;
- to devise a method for comparison of non-convex regions of images, based on the Frechet metric between trees;
- to construct algorithms for computing distances between non-convex regions;
- to perform computer experiments to find distances between non-convex regions.
A tree is a connected graph without cycles. A tree is called a root tree if it contains a selected point (root) [26].
We shall consider a topological tree in Rn(R2), that is, a topological embedding of tree graphs. Thus, a topological tree is a triple (T, a, x0), where TeR", x0 e T and there is graph-tree T and an embedding f: T^R", a(T)=T, f(x0)=x0.
At each topological tree T there exists, therefore, two metrics. The first is induced from R" (denoted as d) and the second is induced from the geodesic metric into T by the mapping a -denoted as pT. How do we find pT(x, y)? It is necessary to take the preimages of a4(x), a4(y) in T, and measure in the graph T the length of the segment connecting these points.
A mapping of graphs is called monotonous if it is monotonous along all segments that originate from the roots.
That is, the mapping f: T&^S is monotonous if f(x0)=y0 (root in S) and, at distance x from x0 along the segment (in the metric pT), point f(x) is monotonously distanced from y0 (in the metric pS).
We shall determine the Frechet distance between topological trees T and S:
dF (T, S ) =
= inf {sup{d(f (x),g(x))|x e R}| f: R ^ T,g: R ^ S}, (1)
here R is a tree and f g are onto maps.
An example of trees mapping is shown in Fig. 1.
Fig. 1. Trees mapping: R,T,S — topological trees
Then f maps the left side R to T, and the right side is along the right side of T. Accordingly, g maps the right R to the right side S, and the left side R to the left side of S, making up the two ends of the fork.
Consider root trees in the plane R2. If (T, t0) is such a tree, then for each t el through ||t|| we denote d(t, t0). Here and hereafter, d denotes the geodesic distance inside the tree, that is, d(t, t0) indicates the length of the arc inside T, which connects t and t0.
Mapping of root trees.
We believe that f:(S, s0)^(T, t0) has the property that f(s0)=t0, and, for each geodesic segment /originating from point t0, we obtain
First, it is necessary to make sure that the given definition is valid. Having T, S, we shall denote through R the bouquet of these two trees, R=TvS (that is, the factor-space of combination T u S relative to the equivalence relation, which identifies point t0 and s0) (Fig. 2).
T S T v S
Fig. 2. A bunch of tree Tand S
Then f:R^T we can accept the mapping that deforms S to a point, and g:R^S, respectively, as the mapping that deforms T to a point.
Theorem. The function D is a metric. Proof.
Next, assume that D(T, S)=0. Hence, it is easy to derive pH=(T, S)=0 (pH denotes the Hausdorff metric on the plane). Thus, T=S.
Let T, S, U be the trees and e>0. There are trees P, R and mapping "onto" f:P^T, g:P^S and k:R^S, h:R^U so that
sup {p( f (p), g (p))| p eP }<D (T, S ) + £,
sup{p(k(r),h(r))|r e R} < T(S,U) + £.
Consider the pullback of trees P and R, that is a tree
Q = {( p, r )eP x R|g (p) = k (r)}.
If Q is not a tree, one may consider an onto map Q&^Q, where Q is a tree.
Let a:Q^P, P:Q(r) be the projections. Then, for each (p,r)eQ, we have (4):
p( f a( p, r)), hp( p, r ) =
= p( f (p), h (r ))<p( f (p), ga( p, r )) + p(kp( p, r), h (r )) = = P p) p( (7 ) + D [S V +2s. (4)
Thus, the theorem proved makes it possible to apply the Frechet metric between trees to find a distance between non-convex regions.
s, s& e/,\\\\S| < | |s f (s )|| <\\\\f (s .
To denote the Frechet metric on the set of all embedded trees in R2, we shall denote as p the Euclidean metric in R2. Assume
D (T ,S) =
= inf {sup{p( f (r),g(r))|r eR}| f: R4. 2. A method for estimating distance between skeletons of the non-convex regions
As a result of the skeletonization process, we obtain the skeleton Sk of the image, which is a tree. Therefore, the skeleton Sk=G=(V, E) is a non-directed graph where V={v0, »1V.., vk} is the set of vertices, E={u0, u1v.., um} is a set of edges. For a non-directed graph, the edge is the set {u, v}, where u, v e V, u ^ v.
For the assigned graph G=(V, E), a path (route) of length k from a vertex u to a vertex u is a sequence of vertices {v0, v1,_, vk} such that u=u0, u=vk, (v;-1,v{)eE for i= =1, 2,_, k. The path includes vertices v0, v1,_, vk and edges T,g: R ^ S}. (3) (v0, v1), (v0, v1),_, (vk-1, vk). If there is a path P from a vertex u
to a vertex u, then one can say that the vertex u is reachable from the vertex u along the path P, that is, u P >u. A path is called simple if all the vertices of the path are different.
Let two connected planar graphs without cycles be assigned (Fig. 3).
Fig. 3. Planar graphs T and S
A merging of graphs T and S is a set of C pairs of (ti, sj), which has the following properties:
in S such that (ti,Sj) e C;
We selected a root t0 in the graph and there are the end vertices. If we take one more similar graph with root s0, then the Frechet distance between these graphs is computed in the following way:
Df (T, 5 ) = mm i max DF ([i0, t1 ],[s0, sj])J.
OP u... u OP u... uOP = V,
OQi u... u O0 u... u O0 = W;
Thus, formula (5) for finding the Frechet distance makes it possible to calculate distances between trees (skeletons) and to construct algorithms for finding a distance between non-convex regions.
Let the two non-convex polygons P and Q received after segmentation be assigned (Fig. 4).
Fig. 4. Polygons P and Q
The algorithm for computing the Hausdorff distance between them is the following:
P = OPt u... u OP u... u OP , where i = 1, n is the quantity of convex regions of polygon P;
Q = OQt u... u OQj u... u OQm, where j = 1, m is the quantity of convex regions of the polygon Q;
d (V,W) = inf {e>0| Vi = 1~n, 3j = Xm
dH O ,Oj )<£ (6)
and vice versa
Vj = 1,m, 3i = 1,n and dH (Ot,Oj)<e,
where dH is the Hausdorff distance;
Trees of the graph can exist without a selected root and with a selected root, that is, free trees. The following theorem holds for them.
Theorem [25]. Let G=(V, E) be a non-directed graph. Then the following holds:
V| -1. -1.
We shall describe the main steps of the algorithm: 1. Following the skeletonization process over two images, we obtain skeletons Sk1 = G1=(V1, E1) and Sk2 = G2=(V2, E2), respectively. Let us represent the obtained trees G1=(V1, E1) and G2=(V2, E2) by applying adjacency matrices and The vertices should be numbered and arranged by numbers 1, 2, ..., |V|. Then the adjacency matrix A=(aj), of size |V| x |V|, is such that
Adjacency matrix A for a non-directed graph is equal to transposed matrix AT. That is why we shall consider only those elements of the matrix, which are located along the main diagonal and above, that is, the submatrix A".
P2={p1, p2,...,pk}, each of which is the subset p\\ = {v0,vr,...,vi} i = 1,k, and p2 = {v0,vr,...,vi} i = 1,r, where k and ris the number of end-vertices of the trees G1 and G2, respectively.
Df p P2 ) = min
m ax Df ([ pi, pi ], [ Po2, p2 ])
( p1P jeC
The polygons that are examined in the present study were obtained following the segmentation of histologic and cytologic images from database [29]. As a result of the skele-tonization process over polygons, we obtained skeletons of the examined micro objects.
Examples of polygons and their skeletons are shown in Table 1.
Examples of polygons and their skeletons
No. of experiment
Polygon of region 1
Polygon of region 2
Skeleton
Skeleton of region 1
Skeleton of region 2
Values of the Hausdorff distance between polygons are given in Table 2.
Values for the Frechet distance for skeletons are given in Table 3.
Values of the Hausdorff distance between polygons
No. Polygon of region 1 Polygon of region 2 The Hausdorff distance, pixels
Values for the Fréchet distance for skeletons
Skeleton of region 1
Skeleton of region 2
The Fréchet distance, pixels
Time for finding the distance between polygons and skeletons is shown in Fig. 5.
-The Hausdorff distance between regions
-The Frechet distance between skeletons
Fig. 5. Time for finding the Hausdorff distance between polygons and the Frechet distance between skeletons
The values of relative error for the Hausdorff distance between polygons and the Frechet distance between skeletons are shown in Fig. 6.
Computer experiments that we performed show that the deviation in the value of the Hausdorff distance between polygons and the Frechet distance between skeletons is within 2-3 %.
Modern algorithms for computing the Hausdorff distance between regions with low computational complexity exist only for the convex regions. Using the Hausdorff metric to calculate distances between the non-convex regions is computationally complicated. In this case, non-convex regions need to be converted into convex regions. This requires additional computational costs.
A characteristic of any region is the skeleton, which is a tree. That is why finding a distance between regions is replaced by finding a distance between skeletons. We proposed a new mathematical structure - the Frechet metric between trees, which allowed us to solve the problem on computing distances between the non-convex regions.
The advantage of the Frechet metric between trees is the possibility to calculate distances between the non-convex regions. This benefit is achieved by replacing the calculation of distances between regions with the computation of distances between skeletons of the regions. The method devised and the algorithms constructed have lower computational
complexity compared to known algorithms for calculating the Hausdorff distance between the non-convex regions. Numerical simulation of the algorithms that we performed showed that the error in finding a distance based on the Hausdorff metric and on the Frechet metric between trees lies within 2-3 %. In this case, however, the time for finding distances between regions when using the Frechet metric between trees is significantly reduced, compared to the Hausdorff metric.
The first constraint for the application of the Frechet metric between trees is the necessity to find skeletons of the regions. However, modern algorithms for finding skeletons possess low computational complexity. That is why finding the skeletons of regions is not a difficult task.
The second limitation is the need to separate the roots of skeletons. This limitation is caused by the fact that the introduced Frechet metric between trees holds only to the root trees. This constraint can be eliminated by developing the Frechet metric between the non-root trees. This is not a fundamental limitation. That is why the next steps in the research imply development of the Frechet metric Frechet for the non-root trees.
Results of the present study could be used not only for testing known and new segmentation algorithms, but also for image recognition, as well as image search in databases. This significantly expands the scope of application of the constructed metric.
Acknowledgement
Present work was conducted within the framework of the state-funded program "Hybrid intelligent information technology for diagnosing precancerous breast cancer based on the analysis of images". Registration number 1016U002500.
References
Розглянута задача формування вибiрко-вих оцток кореляцшно1 матриц спостере-жень за критерieм «обчислювальна стш-тсть - спроможтсть» та проаналiзована спроможтсть стшких оцток при статичтй регуляризацп. Виявлена проблема задачi регуляризацп щток, для виршення яког запро-поновано альтернативний метод динамiчноi регуляризацп. Отримана оптимальна функ-цш динамiчноi регуляризацп вибiркових щток в умовах апрiорноi невизначеностi та надат чисельт результати
Ключовi слова: статична регуляризащя, динамiчна регуляризащя, стшк1сть, збiж-тсть, спроможтсть оцток, кореляцшна матриця
Рассмотрена задача формирования выборочных оценок корреляционной матрицы наблюдений по критерию «вычислительная устойчивость - состоятельность» и проанализирована состоятельность устойчивых оценок при статической регуляризации. Выявлена проблема задачи регуляризации оценок, для решения которой предложен альтернативный метод динамической регуляризации. Получена оптимальная функция динамической регуляризации выборочных оценок в условиях априорной неопределенности и представлены численные результаты
■а о
UDC 681.518.2, 681.514
|DOI: 10.15587/1729-4061.2017.1192641
DEVELOPMENT OF THE METHOD FOR DYNAMIC REGULARIZATION OF SELECTED ESTIMATES IN THE CORRELATION MATRICES OF OBSERVATIONS
V. Skachkov
Doctor of Technical Sciences, Professor, Leading Researcher* Е-mail: v_skachkov@ukr.net V. C h e p ky i PhD, Associate Professor, Leading Researcher* H. Bratchenko Doctor of Technical Sciences, Professor, Vice-rector for Research and International Relations*** Е-mail: bratchenkohd@gmail.com H. Tkachuk Senior Lecturer Department of fundamental sciences** Е-mail: tkachuk_el@ukr.net N. Kazakova Doctor of Technical Sciences, Associate Professor, Head of Department Department of computer, information and measurement technologies*** Е-mail: kaz2003@ukr.net *Scientific Research Laboratory** **Military Academy Fontanska doroha str., 10, Odessa, Ukraine, 65009 ***Odessa State Academy of Technical Regulation and Quality Kovalska str., 15, Odessa, Ukraine, 65020
Inversion of the correlation matrix of observations belongs to the class of problems associated with the reversion of
cause-effect relationships. This procedure is the basis for solving inverse statistical problems in applications of spectral analysis, space-time processing of multidimensional signals, control theory, identification, prediction and decision making [1-8].