MDSClone: Multidimensional Scaling Aided Clone Detection in Internet of Things

Cloning is a very serious threat in the Internet of Things (IoT), owing to the simplicity for an attacker to gather configuration and authentication credentials from a non-tamper-proof node, and replicate it in the network. In this paper, we propose MDSClone, a novel clone detection method based on multidimensional scaling (MDS). MDSClone appears to be very well suited to IoT scenarios, as it: 1) detects clones without the need to know the geographical positions of nodes; 2) unlike prior methods, it can be applied to hybrid networks that comprise both static and mobile nodes, for which no mobility pattern may be assumed a priori. Moreover, a further advantage of MDSClone is that 3) the core part of the detection algorithm can be parallelized, resulting in an acceleration of the whole detection mechanism. Our thorough analytical and experimental evaluations demonstrate that MDSClone can achieve a 100% clone detection probability. Moreover, we propose several modifications to the original MDS calculation, which lead to over a 75% speed up in large scale scenarios. The demonstrated efficiency of MDSClone proves that it is a promising method towards a practical clone detection design in IoT.


Po-Yen Lee, Chia-Mu Yu, Tooska Dargahi, Mauro Conti, Senior Member, IEEE, and Giuseppe Bianchi
Index Terms: Network security, clone attack, Internet of Things, multidimensional scaling.

I. INTRODUCTION
Internet of Things (IoT) is an emerging networking paradigm in which a large number of interconnected devices make up several smart sectors, such as smart homes, smart hospitals, and smart cars [2], which are significant applications of IoT. On account of their restricted features and capabilities, IoT devices are vulnerable to several security threats [3]. For example, IoT devices could easily be captured, leading to a clone attack (also known as a node replication attack). In such a scenario, the captured device is reprogrammed, cloned, and placed back in the network. Moreover, in special cases (e.g., misconfiguration or production by untrusted manufacturers with adversarial intentions), devices that are supposed to be trusted can cause clone attacks [4]. A clone attack is extremely harmful, because clones with legitimate credentials will be considered legitimate devices. Therefore, such clones can easily perform various malicious activities in the network [5], [6], such as launching an insider attack (e.g., a blackhole attack) and injecting false data, leading to hazards in an IoT scenario.

A. Problem Statement
While there exists a fairly extensive literature on clone attack detection approaches in WSNs [7], [8], this remains an open problem when it comes to IoT scenarios. In particular, two unique characteristics of the IoT environment make the establishment of clone detection schemes in IoT more challenging. First, there could be a lack of accurate geographical position information for the devices. Second, IoT networks are hybrid networks composed of both static and mobile devices without an a priori mobility pattern (they can be static or moving with high or low velocity) [9]. Although some of the existing clone detection methods for mobile networks (e.g., [10]-[12]) could be applied to hybrid networks (composed of both stationary and mobile devices), they suffer from a degradation in detection probability.

B. Contribution
In this paper, we propose MDSClone, a novel clone detection mechanism for IoT environments. MDSClone specifically circumvents the two major above-mentioned issues that emerge in IoT scenarios by adopting a multidimensional scaling (MDS) algorithm [13], [14]. In particular, our main contributions are as follows.
1) Our proposed MDSClone method does not rely on geographic positions of nodes and is capable of detecting clones in the network based on topology distortion, without considering any specific mobility pattern.
2) We show that MDSClone is efficient in terms of the computational overhead, because the main computation is performed by the base station (BS), and the server-side computation can easily be parallelized to significantly improve the performance.
3) Along with the main MDSClone algorithm, we also propose three techniques (i.e., CIPMLO, TI, and SMEBM) to speed up the core part of MDSClone, which comprises the MDS calculation.
4) We provide a thorough evaluation of our proposed method considering different evaluation criteria, i.e., the clone detection probability and computation time of our algorithm when adopting our proposed speed-up methods.

II. RELATED WORK
In this section, we review the clone detection methods that are most closely related to our work, and clarify the difference between our proposal and the existing related work.
In the case of static networks, a popular approach for detecting clones is witness finding. In essence, the idea behind witness finding is that the existence of clones must lead to location conflicts. More specifically, each node u collects the location information, L(v), of each of its neighboring nodes v, and sends the collected location claims ⟨v, L(v)⟩ to some selected nodes. Nodes receiving two location claims with the same ID v, but with two distinct locations, will serve as witness nodes, and witness the location conflict. The witness finding strategy not only detects the existence of clones, but also identifies the clone IDs.
A network-wide broadcast is the simplest way to find a witness, but this incurs a prohibitive communication cost. Parno et al. [15] proposed two approaches, randomized multicast (RM) and line-selected multicast (LSM), in order to reduce the communication costs of network-wide broadcasts. Two other approaches proposed in [16], i.e., single deterministic cell (SDC) and parallel multiple probabilistic cells (P-MPC), share the same spirit as RM and LSM. However, SDC and P-MPC are only efficient when the network is partitioned into cells. Compared with the aforementioned approaches, the protocol proposed in [17], i.e., the randomized, efficient, and distributed (RED) protocol, provides an almost-perfect guarantee of clone detection. RED utilizes a special centralized broadcasting device, such as a satellite or a UAV, in order to periodically broadcast the node IDs responsible for detecting particular conflicting location claims. In another study, Zhang et al. [18] proposed four clone detection methods that take advantage of double ruling and the Bloom filter. Recently, Dong et al. [19] proposed the low-storage clone detection (LSCD) method, taking into account the memory requirements and residual energies of nodes. An inherent weakness of all the witness finding-based approaches is the assumption that location information is available for each node. A couple of solutions take alternative approaches to detect clones, such as the social fingerprint [20], predistributed keys [21], and random clustering [22] methods.
In the case of mobile sensor networks, by using a simple challenge-and-response strategy, XED [10] presents the first distributed clone detection method for mobile networks. However, it is vulnerable to collusion among the cloned nodes. EDD [10], [11] is a distributed clone detection method based on the discrepancy between the distributions of the numbers of encounters with clone and ordinary nodes. In [23], a base station (BS) collects the geographical positions of nodes, looking for a clone moving with a speed exceeding the preconfigured speed limit. In [5] and [12], the same idea is employed, but the ordinary nodes play the role filled by the BS in [23].
In summary, the existing clone detection methods devised for static networks cannot be applied to scenarios where node mobility would destroy the neighborhood and distance relations among the nodes. On the other hand, as mentioned above, applying most of the mobile clone detection methods to hybrid networks results in a degradation of the clone detection probability. Therefore, in order to deal with clones in IoT environments, we need to provide a method that is "particularly" designed for hybrid networks, and does not rely on any assumption regarding the mobility pattern, if any. In addition, prior solutions largely rely on the assumption that each node is aware of its geographical position. However, as explained in Section I, this is not the case for IoT devices. As a consequence, the existing clone detection methods are not applicable to IoT environments. Table I presents a comparison between MDSClone and the other existing clone detection schemes, in terms of the communication and memory overhead, required information, and network type.

III. SYSTEM MODEL
In this section, we describe our considered network model and assumptions (Section III-A), as well as the attack model (Section III-B).

A. Network Model
We consider an IoT network as a hybrid network consisting of two main entities: 1) n static and mobile nodes with unique IDs [26]: ID ∈ {1, . . ., n}; and 2) a base station (BS). Each IoT device periodically measures its distance to its neighboring nodes, and sends the information to the BS. In our system model, the BS is in charge of executing our proposed MDSClone algorithm and locating the "clones" (for a definition please refer to Section III-B) in the network. In particular, the BS periodically receives neighboring information for each node in the network, and constructs a location map (based only on the information received from the nodes) in order to detect clones (we explain the details of the MDSClone algorithm in Section V-A). The BS executes MDSClone offline, and each generated location map is dedicated to a snapshot of the network at time t. The main idea in our proposed method is that at time t, a node x cannot have two different sets of neighbors, which means that x cannot be in two different locations of the network at time t. In our network model, we make the following assumptions.
(i) Nodes are not necessarily aware of their exact geographical position.
(ii) Mobile nodes are moving without any particular mobility pattern.
(iii) IoT devices are capable of enacting short-range device-to-device communication. Therefore, each node can measure its distance from its neighboring nodes via radio signal strength (RSS) or time of arrival (ToA). Although the estimated distances are not perfectly accurate, they are sufficient for our approach.
(iv) The BS knows the geographic positions of IoT devices at the very beginning (only during the initialization of the network). However, after the network deployment, the BS is no longer aware of the positions of the devices.
(v) There exists a loose time synchronization between the nodes, and the network operation time is divided into time intervals, each of which has the same length.
(vi) The exchanged messages are digitally signed before being sent out, unless stated otherwise. We have studied the practicality and efficiency of such operations in [27] and [28].

B. Attack Model
IoT devices are usually considered not to be tamper-resistant [29]; the stored security credentials can all be extracted in the case of a device being compromised. In this paper, we consider an adversary that is capable of performing a "clone attack," meaning that they are able to fabricate copies of compromised devices by storing the legitimate credentials from the compromised devices inside several fabricated devices, which is consistent with related work on clone detection such as [10]. A compromised node, as well as the fabricated nodes that have the same ID and credentials as the compromised node, are called clones. Clones can communicate and collude with each other, attempting to subvert the detection functionality in a stealthy manner.
Rather than a generic clone model consisting of s clone groups, each of which contains at most z clones, we consider a simplified clone model, similar to [10]. In our model, there is only one clone group, with exactly two clones having the same ID. The clone ID refers to the ID of the two clones in a specific clone group, unless stated otherwise. The purpose of such a simplified model is to ease the presentation of our main idea, while our method can naturally be applied to a generic clone model without compromising the security.

IV. PRELIMINARIES
Before introducing our proposed clone detection method, we provide a brief background regarding MDS in Section IV-A, which serves as the foundation of our approach. A localization method using MDS that we describe in Section IV-B, called MDS-MAP, provides a core subroutine in our scheme.

A. Multidimensional Scaling
Multidimensional scaling (MDS) [13] is a hyperspace embedding technique, through which pairwise distances are fit into a set of coordinates with the preservation of distance restrictions. More concretely, MDS takes a distance matrix $D$ as input, which is formed from the distances between all pairs of nodes. The output of MDS is a set of coordinates created using only $D$. The first step is to calculate an inner product matrix $B = CAC$, with $A = -\frac{1}{2} D^{(2)}$ (where $D^{(2)}$ holds the element-wise squared distances) and $C = I - \frac{1}{n} E E^T$, which satisfies the relation

$$B = CAC = X X^T,$$

where $I$ is an identity matrix, $E$ is a column vector composed of 1's, and $X$ is a coordinate matrix with each row being a $p$-dimensional coordinate. One can easily observe that $B$ is a real-valued and symmetric matrix, and hence we can apply orthogonal diagonalization to $B$ to obtain

$$B = Q \Lambda Q^T,$$

where $\Lambda$ is a diagonal matrix of eigenvalues and $Q$ is composed of the corresponding orthogonal eigenvectors. Owing to the fact that $B = X X^T$, we can obtain the reconstructed coordinate matrix $X'$ by calculating

$$X' = Q \Lambda^{1/2},$$

keeping only the $p$ largest eigenvalues and their eigenvectors. However, the coordinate matrix $X'$ reconstructed by MDS is not necessarily identical to $X$. In essence, $X'$ is only guaranteed to preserve the pairwise distances $D$, but is subject to translations (shifts), rotations, and reflections. In other words, $X$ and $X'$ are treated as equivalent, and both can act as the reconstructed coordinate matrix, if $X$ and $X'$ induce the same $D$.
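The MDS steps above can be sketched in a few lines of NumPy. This is a minimal illustration of classical MDS under the standard double-centering formulation, not the authors' implementation; the function names `classical_mds` and `pairwise_distances` and the four-node example are assumptions made for illustration.

```python
import numpy as np

def classical_mds(D, p=2):
    """Embed n points in p dimensions from an n x n distance matrix D."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    A = -0.5 * D ** 2                      # A = -1/2 * element-wise squared distances
    C = np.eye(n) - np.ones((n, n)) / n    # centering matrix C = I - (1/n) E E^T
    B = C @ A @ C                          # inner product matrix B = C A C
    eigvals, eigvecs = np.linalg.eigh(B)   # B is symmetric, so eigh applies
    top = np.argsort(eigvals)[::-1][:p]    # indices of the p largest eigenvalues
    lam = np.clip(eigvals[top], 0.0, None) # guard against tiny negative round-off
    return eigvecs[:, top] * np.sqrt(lam)  # X' = Q * Lambda^(1/2)

def pairwise_distances(X):
    """Euclidean distance between every pair of rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

# Hypothetical example: four nodes on the corners of a unit square.
X = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
D = pairwise_distances(X)
X_rec = classical_mds(D, p=2)
D_rec = pairwise_distances(X_rec)
```

Here `X_rec` generally differs from `X` by a shift, rotation, or reflection, while `D_rec` matches `D` up to floating-point error, which is the distance-preservation property the section relies on.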

B. Localization via MDS
Given a network, MDS-MAP [14] is a localization algorithm executed by the BS. In particular, MDS-MAP takes a subset of pairwise distances of the nodes as input, and generates the coordinates of the nodes in the network. The difference between the ordinary MDS and MDS-MAP lies in the fact that the calculation of MDS assumes that the BS has the knowledge of all pairwise distances. However, this assumption is not realistic, particularly in wireless networks. Thus, MDS-MAP combines the techniques of MDS and a shortest path calculation from graph theory to approximate the ordinary MDS. More specifically, in the case where nodes i and j are far away from each other and i cannot obtain a measured distance from j, the BS instead obtains an approximate $d_{i,j}$ (i.e., the distance between i and j) by calculating the corresponding shortest path. Using this approach, the BS can easily obtain all of the pairwise distances, although some of these are approximate. Next, the BS performs ordinary MDS on the pairwise distances to derive the coordinate matrix and accomplish the localization. Although the approximate distances comprising the input to MDS may cause a certain distortion in the reconstructed coordinates, Shang et al. [14] demonstrated the acceptable reconstruction accuracy of MDS-MAP. Figure 1 shows an illustrative example of the MDS-MAP process. In Figure 1a, each node measures its distance from its neighbors and reports this to the BS. Then, the BS uses the MDS-MAP reconstructed coordinates as the nodes' positions to construct the network map, as shown in Figure 1b.
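The shortest-path completion step of MDS-MAP can be illustrated with a plain Floyd-Warshall pass. The sketch below is an assumption-laden illustration: the function name `complete_distances`, the dictionary layout of the reports, and the 4-node chain are all hypothetical, and measurements are treated as symmetric.

```python
import math

def complete_distances(n, reports):
    """Fill an n x n distance matrix from neighbor reports via Floyd-Warshall.

    reports maps node i -> {neighbor j: measured distance d_ij}.
    Pairs without a measurement are approximated by the shortest-path length.
    """
    D = [[0.0 if i == j else math.inf for j in range(n)] for i in range(n)]
    for i, neighbors in reports.items():
        for j, d in neighbors.items():
            # Treat measurements as symmetric; keep the smaller value if both ends report.
            D[i][j] = D[j][i] = min(D[i][j], d)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if D[i][k] + D[k][j] < D[i][j]:
                    D[i][j] = D[i][k] + D[k][j]
    return D

# Hypothetical 4-node chain 0 - 1 - 2 - 3 with unit-length hops.
reports = {0: {1: 1.0}, 1: {0: 1.0, 2: 1.0}, 2: {1: 1.0, 3: 1.0}, 3: {2: 1.0}}
D = complete_distances(4, reports)
```

The completed matrix can then be handed to an ordinary MDS routine; entries such as `D[0][3]` are path lengths rather than measured distances, which is the source of the reconstruction distortion mentioned above.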

V. PROPOSED METHOD
In this section, we describe MDSClone, our proposed method for clone detection. In particular, we explain the basic construction of MDSClone in detail in Section V-A. Then, in Section V-B we describe several improvements to our main construction to yield a more efficient clone detection algorithm. Note that although we mainly use MDS-MAP [14] to calculate the coordinates of IoT devices throughout the paper, we only use the term "MDS" in the remainder of the paper, for representational simplicity.

A. Main Construction of MDSClone
The idea behind our proposed MDS-based solution, MDSClone, is inspired by the following observation: When each node reports its neighbor-distance information, consisting of its neighbor list along with the measured pairwise distances, to the BS, the BS can construct a node map via MDS without the need to know the exact location information of the nodes. Note that the node map refers here to a set of coordinates of IoT devices, and corresponds to the reconstructed coordinate matrix $X'$ in Section IV-A. In the case that no clones are present in the network, a coordinate matrix $X'$ will be generated such that the collected pairwise distances can be approximately preserved (i.e., each entry of the matrix $D' - D$ is close to 0, where $D'$ is the distance matrix calculated from $X'$). On the other hand, consider a network with a clone. From the BS's perspective, if the information reported from devices collectively contains two nodes with the same ID but completely different neighbor lists, then the reconstructed node map $X'$ will be distorted. More precisely, because two clones can be thought of as two identical nodes that simultaneously appear at two distant locations (e.g., node B in Figure 2a), at least one additional dimension is required in $X'$ to achieve distance preservation. Because the number $p$ of dimensions should be a fixed and public parameter, or we may even restrict ourselves to two-dimensional MDS reconstruction ($p = 2$), it follows that a distortion in the reconstructed map is unavoidable (see Figure 2b). As a consequence, from the perspective of clone detection, the failure of MDS in constructing a node map that achieves distance preservation indicates the existence of clones in the network. To identify clones, the BS may execute MDS multiple times, excluding different node IDs. For example, if the MDS calculation for nodes {1, . . ., n} results in an erroneous node map, and the MDS calculation excluding node i (i.e., on nodes {1, . . ., (i − 1), (i + 1), . . ., n}) achieves a perfect node map reconstruction, then node i must be a clone, because it caused the distortion in the MDS.
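The exclusion idea can be tried out on a toy deployment. The sketch below is a simplified illustration under loud assumptions, not the authors' algorithm: every pair of devices is assumed to be within radio range (so the clone-free map is exact), the two physical copies of the cloned ID are merged at the BS by keeping the smaller reported distance, the sum of squared differences is used as the error measure, and the threshold `lam` and all coordinates are hypothetical.

```python
import numpy as np

def pairwise(X):
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def shortest_paths(W):
    """Floyd-Warshall on a symmetric weight matrix with zero diagonal."""
    D = W.copy()
    for k in range(len(D)):
        D = np.minimum(D, D[:, [k]] + D[[k], :])
    return D

def classical_mds(D, p=2):
    n = len(D)
    C = np.eye(n) - np.ones((n, n)) / n
    B = C @ (-0.5 * D ** 2) @ C
    vals, vecs = np.linalg.eigh(B)
    top = np.argsort(vals)[::-1][:p]
    return vecs[:, top] * np.sqrt(np.clip(vals[top], 0.0, None))

def map_error(W, p=2):
    """Sum of squared differences between collected and reconstructed distances."""
    D = shortest_paths(W)
    return float(np.sum((pairwise(classical_mds(D, p)) - D) ** 2))

# Five honest nodes plus one ID (index 5) used by two distant physical devices.
honest = np.array([[0, 0], [10, 0], [0, 10], [10, 10], [5, 5]], dtype=float)
copies = np.array([[0, 1], [10, 9]], dtype=float)
n, clone_id = 6, 5
W = np.zeros((n, n))
W[:5, :5] = pairwise(honest)
# The BS sees a single ID, so each honest node's distance to it is the nearer copy.
to_copies = np.linalg.norm(honest[:, None, :] - copies[None, :, :], axis=-1)
W[:5, clone_id] = W[clone_id, :5] = to_copies.min(axis=1)

lam = 1e-6                     # hypothetical distortion threshold
full_error = map_error(W)      # distorted because of the shared ID
suspects = []
if full_error >= lam:
    for i in range(n):
        keep = [j for j in range(n) if j != i]
        if map_error(W[np.ix_(keep, keep)]) < lam:
            suspects.append(i)  # excluding i restores a consistent map
```

In this toy setup the full map cannot be embedded in two dimensions, while the map without ID 5 is reconstructed essentially exactly, so ID 5 ends up in `suspects`. With a realistic radio range the clone-free error is no longer near zero, which is why the choice of threshold matters.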
In what follows, we describe three main design challenges that appear in adopting MDS for clone detection, and explain how MDSClone addresses these challenges.
Algorithm 1 MDSClone Performed by the BS
1: Input: $\langle t, i, \{(j, d_{i,j})\}_{j \in N_i} \rangle$: neighbor-distance information received from node $i$; $\lambda$: distortion threshold.
...
6: For $\rho = 1 \ldots n$
...
10: Nodes $\pi_1, \ldots, \pi_\rho$ are identified as clones
11: Calculate $M(L_t, L_{t-1})$ to locate clones $\pi_1, \ldots, \pi_\rho$

• First, the BS requires the pairwise distances of all the nodes in the network in order to run the MDS algorithm. However, such information is not available. Therefore, the first challenge is to enable the BS to perform the MDS calculation using only a "subset" of pairwise distances. The reason behind this challenge is that in an IoT network, each IoT device can only estimate its distance from its neighboring nodes, e.g., via RSS. Hence, the neighboring information reported to the BS does not include the pairwise distances of all nodes in the network. We address this challenge by using the shortest path between two nodes in order to approximately calculate the Euclidean distance between them (inspired by MDS-MAP [14]).
• The second challenge is to design a localization function (i.e., $M(L_t, L_{t-1})$) in order to "locate" the clones in the network. The reason behind this challenge is that the node map reconstructed by the BS is not necessarily identical to the real positions of nodes (although the pairwise distances are guaranteed to be preserved). In order to address this challenge, we consider two different cases (i.e., the existence of anchor nodes and lack of anchor nodes), as we explain in Section V-A1c.
• The third challenge is the computational overhead imposed on the BS. The reason behind this challenge is that the BS must perform the MDS calculations iteratively in order to find clones. In particular, the BS has to perform, on average, $O(n^c)$ rounds of MDS calculations (where $n$ is the number of nodes in the network and $c$ is the number of clones). We address this challenge by proposing two strategies in MDSClone (as explained in Section V-B): (i) reducing the MDS computational overhead, and (ii) performing the MDS calculation on several server-side devices in a parallel manner.

1) Detailed Description of MDSClone:
The algorithmic description of MDSClone is presented in Algorithm 1. As can be seen, the BS is in charge of running the algorithm and recognizing the existence of a clone in the network. Each node $i$ in the network discovers its neighboring nodes $N_i$, measures the distance $\{d_{i,j}\}_{j \in N_i}$ to each of its neighboring nodes, and sends this neighbor-distance information $\langle t, i, \{(j, d_{i,j})\}_{j \in N_i} \rangle$ to the BS (comprising the input of Algorithm 1) at time $t$. Here, the message $\langle t, i, \{(j, d_{i,j})\}_{j \in N_i} \rangle$ can be thought of as a star-shaped subgraph whose nodes are $N_i \cup \{i\}$. The purpose of this step is for the BS to collect the subset of pairwise distances, similar to the case in MDS. With the neighbor-distance information of the nodes and a pre-defined distortion threshold (which we will explain in Section V-A1a) as input, the BS periodically executes Algorithm 1.
We assume that the BS maintains a table $L$ for storing the received neighbor-distance information. After receiving the messages $\{\langle t, i, \{(j, d_{i,j})\}_{j \in N_i} \rangle\}_{i=1,\ldots,n}$, the BS stores the star-shaped subgraph induced from $\langle t, i, \{(j, d_{i,j})\}_{j \in N_i} \rangle$ at time $t$ in the $t$-th row, $L_t$, of the table $L$ (steps 2 and 3 of Algorithm 1). Hence, the $t$-th entry $L_t$ of the table $L$ consists of the neighbor-distance information sent by all the nodes at time $t$. Then, the BS has to perform MDS over $L_t$ to check whether there are clones present (step 4 of Algorithm 1). More specifically, if the Boolean distortion function $D(\lambda, L_t, X_t, \hat{L}_t)$, which measures the dissimilarity between the pairwise distances from $L_t$ and $X_t$ (we will explain this function in Section V-A1b), returns true (step 5 of Algorithm 1), this indicates a significant distortion in the reconstructed node map. In this case, the BS recognizes clones by applying MDS iteratively to $L_t \setminus \hat{L}_t$, where $\hat{L}_t$ denotes the neighbor-distance information involving the currently excluded candidate nodes. In essence, this is equivalent to applying MDS over the induced subgraph. During the MDS calculations, once the BS has determined a reconstructed node map whose pairwise distances are consistent with the collected pairwise distances (step 9 of Algorithm 1), the BS concludes that the nodes excluded in the current calculation are clones (step 10 of Algorithm 1). By calculating the localization function $M(L_t, L_{t-1})$, the BS identifies and locates the clones (step 11 of Algorithm 1).
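The bookkeeping behind the iterative search can be sketched as follows: forming the reduced table (the collected reports with a candidate set removed) and enumerating candidate sets of size ρ. The dictionary layout of `L_t` and the helper names `exclude_nodes` and `candidate_sets` are assumptions made for illustration only.

```python
from itertools import combinations

def exclude_nodes(L_t, excluded):
    """Return the reports of L_t restricted to nodes outside `excluded`.

    L_t maps node i -> {neighbor j: d_ij}; both the reports sent by excluded
    nodes and the distances measured to them are dropped (induced subgraph).
    """
    return {
        i: {j: d for j, d in neighbors.items() if j not in excluded}
        for i, neighbors in L_t.items()
        if i not in excluded
    }

def candidate_sets(node_ids, rho):
    """All candidate clone sets of size rho, in a deterministic order."""
    return list(combinations(sorted(node_ids), rho))

# Hypothetical table row for one time slot with three nodes.
L_t = {1: {2: 1.0, 3: 2.0}, 2: {1: 1.0}, 3: {1: 2.0}}
reduced = exclude_nodes(L_t, {3})
pairs = candidate_sets(L_t.keys(), 2)
```

The number of candidate sets grows combinatorially with ρ, which is why the text treats the repeated MDS runs as the dominant cost at the BS.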
a) Choice of λ: The distortion threshold, λ, controls the trade-off between the tolerance of the MDS reconstruction error, the computational burden on the BS, and the capability of MDSClone to identify clones. In essence, in the case of a small λ value, MDS reconstruction inaccuracies may be regarded as clones, which leads to repeated MDS calculations in order to find these clones. On the other hand, an inappropriately large λ value may result in the misdetection of clones. As can be seen, determining a suitable λ value is important, and in fact a high sensitivity of the MDS reconstruction accuracy leads to a greater capability in identifying clones. In order to address these challenges, in the following we propose a data-driven method to determine an appropriate λ value. Hence, there will be rare cases in which clones could successfully evade detection owing to an inappropriate selection of λ.
Owing to the fact that IoT devices are all owned by the IoT network owner, all the characteristics of the IoT devices, such as their radio ranges, are available to the network owner. Therefore, the network owner is able to virtually deploy IoT devices in a random manner, and then perform the MDSClone calculation on the virtual deployment. Here, we make two observations. First, even in the virtual deployment without clones, the distance matrix $D$ derived directly from $L_t$ will be slightly different from the distance matrix $D'$ derived from the MDS-MAP reconstruction result. Second, in the presence of clones in the network, the discrepancy between $D$ and $D'$ must be significant, because clones are usually not close to each other, in order to have a more negative impact on the network. As a result, to determine a suitable λ, the network owner sets up $x$ virtual deployments without clones, and chooses the maximum discrepancy value among the $x$ discrepancy values as λ, where $x$ is a sufficiently large value. More precisely, let $A(D', D)_i$ denote the discrepancy between $D'$ and $D$ derived in the $i$-th virtual deployment (a more detailed description of $A(D', D)$ can be found in Section V-A1b). The distortion threshold λ is set as the maximum distortion, i.e., $\lambda = \max\{A(D', D)_1, \ldots, A(D', D)_x\}$.
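The data-driven choice of λ can be sketched as a small simulation. Everything numeric here is an assumption for illustration: the deployment area, node count, radio range, the seed, and the use of the sum of squared differences as the discrepancy $A(D', D)$. Disconnected virtual deployments are simply resampled, which is a simplification of this sketch rather than something stated in the text.

```python
import numpy as np

def pairwise(X):
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def shortest_paths(W):
    D = W.copy()
    for k in range(len(D)):
        D = np.minimum(D, D[:, [k]] + D[[k], :])
    return D

def classical_mds(D, p=2):
    n = len(D)
    C = np.eye(n) - np.ones((n, n)) / n
    B = C @ (-0.5 * D ** 2) @ C
    vals, vecs = np.linalg.eigh(B)
    top = np.argsort(vals)[::-1][:p]
    return vecs[:, top] * np.sqrt(np.clip(vals[top], 0.0, None))

def discrepancy(D):
    """A(D', D): squared error between reconstructed and collected distances."""
    return float(np.sum((pairwise(classical_mds(D)) - D) ** 2))

def choose_lambda(x, n=20, side=100.0, radio_range=60.0, seed=0):
    """Max discrepancy over x clone-free virtual deployments."""
    rng = np.random.default_rng(seed)
    values = []
    while len(values) < x:
        pos = rng.uniform(0.0, side, size=(n, 2))
        dist = pairwise(pos)
        W = np.where(dist <= radio_range, dist, np.inf)  # only neighbors are measured
        D = shortest_paths(W)
        if np.isinf(D).any():
            continue  # disconnected deployment: resample
        values.append(discrepancy(D))
    return max(values), values

lam, values = choose_lambda(x=20)
```

A larger `x` makes the threshold more conservative, since the maximum over more clone-free deployments can only grow.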
b) Construction of $D(\lambda, L_t, X_t, \hat{L}_t)$: Similar to λ, the distortion function, i.e., $D(\lambda, L_t, X_t, \hat{L}_t)$, has a direct impact on the trade-off between the tolerance of the MDS reconstruction error, the computational burden on the BS, and the capability of MDSClone to identify clones, but its construction has not yet been specified.
Here, we propose an algorithm as an implementation of $D(\lambda, L_t, X_t, \hat{L}_t)$. More specifically, the algorithm takes as input a distortion threshold λ, the received neighbor-distance information $L_t$, and the reconstructed node map $X_t$, and outputs an indication of whether the pairwise distances from $L_t$ are inconsistent with the ones from $X_t$. In particular, with the shortest path as an approximation, the BS can calculate the pairwise distances $D = \{d_{i,j}\}_{i,j \in \{1,\ldots,n\}}$ from $L_t \setminus \hat{L}_t$. $X_t$ can also be used to generate the estimated distance matrix $D' = \{d'_{i,j}\}_{i,j \in \{1,\ldots,n\}}$. Hence, $D(\lambda, L_t, X_t, \hat{L}_t)$ returns true (a significant inconsistency between $D$ and $D'$, indicating the existence of clones) if $A(D', D) \geq \lambda$, and false otherwise. Note that $A(D', D)$ here is defined based on the least-square error criterion, although it is possible to adopt other error metrics.
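A minimal sketch of the threshold test follows. It assumes that $A(D', D)$ is the plain sum of squared entry-wise differences, which is one reading of the least-square error criterion rather than a formula given in the text, and it takes the two distance matrices directly instead of deriving them from $L_t$ and $X_t$. The names `least_square_error` and `distortion` are hypothetical.

```python
def least_square_error(D_rec, D):
    """A(D', D): sum of squared differences between two n x n distance matrices."""
    n = len(D)
    return sum((D_rec[i][j] - D[i][j]) ** 2 for i in range(n) for j in range(n))

def distortion(lam, D, D_rec):
    """True when the reconstructed distances deviate from the collected ones by at least lam."""
    return least_square_error(D_rec, D) >= lam

# Hypothetical two-node example: collected distance 1.0, reconstructed 1.1.
D = [[0.0, 1.0], [1.0, 0.0]]
D_rec = [[0.0, 1.1], [1.1, 0.0]]
flag_strict = distortion(0.01, D, D_rec)   # error is about 0.02, above this threshold
flag_loose = distortion(0.05, D, D_rec)    # the same error is below this threshold
```

The example shows the trade-off from the discussion of λ: the same reconstruction error is flagged under a small threshold and tolerated under a larger one.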
c) Construction of M(L t , L t −1 ): After identifying the clones, one option is to announce and revoke clone IDs.In this case, since the nodes physically remain in the network, the attacker might use them in order to conduct other clone attacks or other types of attack, such as blackhole or jamming attacks.Another option is to locate clones and then physically remove them or use one of the existing attestation techniques in the literature (such as [30]).In order to protect the network against further attacks, the latter case is preferable.In this latter case, we need to develop techniques for locating clones in the network, and for that matter we introduce the localization function M(L t , L t −1 ), which we detail in the following.
We consider two constructions for $M(L_t, L_{t-1})$: (i) Anchor case: a network having at least two static nodes; (ii) No anchor case: a network without such a restriction. The idea behind the two constructions of $M(L_t, L_{t-1})$ is node alignment: once the real locations of nodes at the previous time are known and the alignment between $L_t$ and $L_{t-1}$ can be made, we can easily infer the real locations at the current time. In what follows, we explain the two considered scenarios.
Anchor Case: Here, we assume at least two static nodes in the network and we observe that the static nodes can act as anchor points for calculating the transformation matrix $P$. When $P$ is known, one can obtain the real location of nodes by applying node alignment to $L_t$ and $L_{t-1}$.
Consider the case where the BS applies MDS to nodes 1, . . ., n. The transformation matrix can also be applied only to a subset 1, . . ., s, where 1, . . ., s are static nodes, without loss of generality. In other words, if $X' = XP$ holds for all $n$ nodes, it also holds for the rows corresponding to the static nodes 1, . . ., s. A closer look at the currently used MDS reveals that the MDS works in a restricted form; the mean of the reconstructed coordinates of the MDS result will always be shifted to [0 0], in the two-dimensional case. This eliminates the need for handling the translation operation, because one can shift the locations at the previous time to be centered at [0 0], perform the node alignment, and perform the reverse shift on the aligned locations. In this sense, we only need to take care of the rotation and reflection operations. Since the MDS is primarily characterized by an orthogonal diagonalization (see Section IV-A), we simply find an orthogonal matrix to solve the problem. In particular, we look for an orthogonal matrix $P$ such that $X' = XP$, with the property that

$$X' X'^T = XP (XP)^T = X P P^T X^T = X X^T = B.$$

To find such a $P$, we randomly choose two static nodes from 1, . . ., s, and their corresponding reconstructed coordinates. Let $(x_1, y_1), (x_2, y_2)$ and $(x'_1, y'_1), (x'_2, y'_2)$ denote the coordinates of the two chosen nodes in $X$ and $X'$, respectively. The transformation matrix $P$ can be calculated by considering the relation between the two chosen nodes,

$$\begin{bmatrix} x'_1 & y'_1 \\ x'_2 & y'_2 \end{bmatrix} = \begin{bmatrix} x_1 & y_1 \\ x_2 & y_2 \end{bmatrix} P,$$

and can be derived as follows,

$$P = \begin{bmatrix} x_1 & y_1 \\ x_2 & y_2 \end{bmatrix}^{-1} \begin{bmatrix} x'_1 & y'_1 \\ x'_2 & y'_2 \end{bmatrix}.$$

Note that the determinant of $P$, $\det(P)$, determines the operations applied to the nodes: an odd number of reflections implies $\det(P) = -1$, while the other cases (e.g., only rotation or an even number of reflections) imply $\det(P) = 1$. After obtaining $P$, we perform node alignment to keep track of the clone movement. In particular, the reconstructed coordinates will be transformed via $P$ to be consistent with the coordinates at the previous time.
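The anchor-based alignment can be sketched with a 2 x 2 linear solve. The convention used here is an assumption of this sketch: `P` is chosen so that the reconstructed coordinates times `P` give the previous-time coordinates, both sets are already centered at the origin, and the two anchors are not collinear with the origin (otherwise the anchor matrix is singular). The helper name and the example coordinates are hypothetical.

```python
import numpy as np

def transformation_from_anchors(rec_anchors, prev_anchors):
    """Solve rec_anchors @ P = prev_anchors for the 2 x 2 matrix P."""
    rec = np.asarray(rec_anchors, dtype=float)
    prev = np.asarray(prev_anchors, dtype=float)
    return np.linalg.solve(rec, prev)

# Previous-time coordinates (centered at the origin); nodes 0 and 1 are static anchors.
prev = np.array([[1.0, 0.0], [0.0, 2.0], [-1.0, -2.0]])

# Case 1: the new MDS map is the old one rotated by 90 degrees.
rotation = np.array([[0.0, 1.0], [-1.0, 0.0]])
rec_rot = prev @ rotation
P_rot = transformation_from_anchors(rec_rot[:2], prev[:2])
aligned_rot = rec_rot @ P_rot          # all nodes, not only the anchors
det_rot = np.linalg.det(P_rot)

# Case 2: the new MDS map is the old one mirrored across the x-axis.
mirror = np.array([[1.0, 0.0], [0.0, -1.0]])
rec_mir = prev @ mirror
P_mir = transformation_from_anchors(rec_mir[:2], prev[:2])
det_mir = np.linalg.det(P_mir)
```

The sign of the determinant separates the two cases: a pure rotation yields a determinant of 1, while a single reflection yields -1. With noisy distance measurements the solved `P` is only approximately orthogonal, so in practice one might average over several anchor pairs.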
No Anchor Case: In this scenario, we do not have anchors and instead seek a construction of $M(L_t, L_{t-1})$ for mapping two sets of points with the guarantee of minimal nodewise discrepancy. The algorithm for the node alignment is inspired by the problem of point set registration. Point set registration [31] has been widely used in the computer vision community for finding a spatial transformation that aligns two sets of points. In particular, point set registration merges two data sets into a globally consistent model by mapping a new measurement to a known data set. The Iterative Closest Point (ICP) algorithm [32] is the most straightforward way of minimizing the difference between two clouds of points. However, ICP works only in the case of a rigid registration (that is, a rigid transformation), which typically consists of translation and rotation. Point set registration can also be used in our case, where the node locations at the previous time are mapped to the ones at the current time. However, our problem can be thought of as a variant of the conventional point set registration. The differences are stated below:
• While the sizes of the two sets of points in the conventional point set registration problem are allowed to be different, in our case they are guaranteed to be the same.
• While one set of points will be deterministically transformed from another set, in our case, due to the consideration of node mobility, we may consider the option of either imperfect transformation or noisy transformation.
Moreover, the MDS-reconstructed node map is subject to not only translation and rotation but also reflection, as shown in Section IV-A. Thus, because an affine transformation covers a richer set of transformations, including translation, rotation, and reflection, we resort to the affine variant of ICP, affine ICP, to find the transformation $P$ between $\{\ell_1^{t-1}, \ldots, \ell_n^{t-1}\}$ and $\{\ell_1^{t}, \ldots, \ell_n^{t}\}$, where $\ell_i^{t-1}$ and $\ell_i^{t}$ are the coordinates of the $i$-th node at the $(t-1)$-th time and the $t$-th time, respectively, with $L_{t-1}$ and $L_t$ being the corresponding coordinate matrices.

Since affine transformations form a richer set than the three transformations required in our context, our problem can still be recast as the affine registration of two two-dimensional point sets. More formally, based on the least-squares error criterion, the affine registration between the two point sets can be formulated as

$$\min_{P,\; j \in \{1,\ldots,n\}} \sum_{i=1}^{n} \left\| P(\ell_i^{t-1}) - \ell_j^{t} \right\|_2^2. \tag{5}$$

However, the transformation $P$ in Equation (5) is too vague; it needs to be expressed explicitly so as to simplify the objective function. Because the affine transformation $P$ can be decomposed into an invertible matrix $\Lambda$ together with a translation vector $s$, Equation (5) can be rewritten as

$$\min_{\Lambda,\, s,\; j \in \{1,\ldots,n\}} \sum_{i=1}^{n} \left\| \Lambda \ell_i^{t-1} + s - \ell_j^{t} \right\|_2^2. \tag{6}$$

Note that the presence of anchors eliminates the need to handle the coordinate shift, but the lack of anchors does not. Furthermore, an invertible real-valued matrix can be decomposed via singular value decomposition (SVD) into $\Lambda = USV^{T}$, where $U$ and $V$ are orthogonal matrices and $S$ is a diagonal matrix of singular values. Because $V^{T}$ is an orthogonal matrix, we define the rotation matrix $R = V^{T}$. The objective function in Equation (6) can then be rewritten as

$$\min_{U,\, S,\, R,\, s,\; j \in \{1,\ldots,n\}} \sum_{i=1}^{n} \left\| USR\,\ell_i^{t-1} + s - \ell_j^{t} \right\|_2^2. \tag{7}$$

Recall that an affine transformation is a combination of a series of basic transformations, such as translation, rotation, and reflection. In Equation (7), $U$ and $R$ represent reflection and rotation matrices, respectively, while $S$ is a scale matrix.

$U$ and $R$ are orthogonal matrices, and the collection of nodes is not scaled in our setting, so $S$ reduces to the identity. Considering these constraints, the unconstrained optimization in Equation (7) can be transformed into the constrained optimization

$$\min_{U,\, R,\, s,\; j \in \{1,\ldots,n\}} \sum_{i=1}^{n} \left\| UR\,\ell_i^{t-1} + s - \ell_j^{t} \right\|_2^2 \quad \text{s.t.} \quad U^{T}U = I_2,\; R^{T}R = I_2. \tag{8}$$

Inspired by robust affine ICP [31], the optimization in Equation (8) can be solved by the iterative algorithm shown in Algorithm 2. Algorithm 2 outputs the matrices $U_k$ and $R_k$ and the translation vector $s_k$. More specifically, given $U_{k-1}$, $R_{k-1}$, and $s_{k-1}$, step 3 of Algorithm 2 establishes the correspondence between the point sets:

$$c_k(i) = \arg\min_{j \in \{1,\ldots,n\}} \left\| U_{k-1}R_{k-1}\,\ell_i^{t-1} + s_{k-1} - \ell_j^{t} \right\|_2^2, \tag{9}$$

where $i = 1, \ldots, n$. Note that step 3 of Algorithm 2 can be solved by methods such as the k-d tree [33] and nearest point search based on Delaunay tessellation [34]. Given the found correspondence $c_k(i)$, step 5 of Algorithm 2 computes the $k$-th transformation $(U_k, R_k, s_k)$. Let $L(2)$ denote the orthogonal group consisting of the set of real $2 \times 2$ orthogonal matrices. Considering the Taylor series of exponential mappings on $U_k, R_k \in L(2)$, step 5 of Algorithm 2 can be formulated and solved using quadratic programming.

Algorithm 2 Node Alignment Based on Affine ICP
1: INPUT: $k$ is initialized as 0; $U_0$, $R_0$, and $s_0$ are randomly chosen; $\epsilon$ is the considered threshold.
2: Repeat
3: Compute the correspondence (Equation (9))
...

Lemma 1: Algorithm 2 converges monotonically to a local minimum with respect to the mean squared distance, from any given initial parameters.
Algorithm 2 shares the same spirit as robust affine ICP [31]. Owing to this similarity and the lack of space, we omit the formal proof of Lemma 1 here.
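The alternation between correspondence search and transformation update can be sketched as follows. This is a simplified stand-in rather than Algorithm 2 itself: it starts from the identity instead of random parameters, uses brute-force nearest-neighbour search, and replaces the quadratic-programming step with the closed-form two-dimensional orthogonal Procrustes fit (trying both the plain and the mirrored case). All function names are illustrative.

```python
import math

def apply_map(points, M, shift):
    """Apply q = M p + shift to every 2D point."""
    return [(M[0][0] * x + M[0][1] * y + shift[0],
             M[1][0] * x + M[1][1] * y + shift[1]) for x, y in points]

def fit_orthogonal(src, dst):
    """Least-squares orthogonal M (optionally mirrored) and shift with dst ~ M src + shift."""
    n = len(src)
    cs = (sum(p[0] for p in src) / n, sum(p[1] for p in src) / n)
    cd = (sum(q[0] for q in dst) / n, sum(q[1] for q in dst) / n)
    best = None
    for flip in (1.0, -1.0):                      # -1 mirrors the source first
        cos_part = sin_part = 0.0
        for (px, py), (qx, qy) in zip(src, dst):
            ax, ay = px - cs[0], flip * (py - cs[1])
            bx, by = qx - cd[0], qy - cd[1]
            cos_part += ax * bx + ay * by
            sin_part += ax * by - ay * bx
        th = math.atan2(sin_part, cos_part)       # optimal rotation angle
        c, s = math.cos(th), math.sin(th)
        M = [[c, -s * flip], [s, c * flip]]       # rotation times optional reflection
        shift = (cd[0] - (M[0][0] * cs[0] + M[0][1] * cs[1]),
                 cd[1] - (M[1][0] * cs[0] + M[1][1] * cs[1]))
        err = sum((u - qx) ** 2 + (v - qy) ** 2
                  for (u, v), (qx, qy) in zip(apply_map(src, M, shift), dst)) / n
        if best is None or err < best[2]:
            best = (M, shift, err)
    return best

def icp(prev_pts, cur_pts, eps=1e-12, max_iter=50):
    """Map prev_pts onto cur_pts; returns (M, shift, mean squared error)."""
    M, shift, last = [[1.0, 0.0], [0.0, 1.0]], (0.0, 0.0), float("inf")
    for _ in range(max_iter):
        moved = apply_map(prev_pts, M, shift)
        # correspondence step: nearest current point for every moved point
        match = [min(cur_pts, key=lambda q: (q[0] - m[0]) ** 2 + (q[1] - m[1]) ** 2)
                 for m in moved]
        M, shift, err = fit_orthogonal(prev_pts, match)   # transformation step
        if last - err < eps:                              # error no longer decreases
            break
        last = err
    return M, shift, err

prev_map = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (1.0, 5.0), (-2.0, 2.0)]
angle = math.radians(10)
true_M = [[math.cos(angle), -math.sin(angle)], [math.sin(angle), math.cos(angle)]]
cur_map = apply_map(prev_map, true_M, (0.3, -0.2))
M, shift, err = icp(prev_map, cur_map)
```

As with any ICP variant, the result depends on the starting point: a large rotation or a reflection may require several initial guesses before the nearest-neighbour correspondences become correct.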

B. Techniques for Efficiency Improvement of MDSClone
As explained in Section V-A, the BS executes Algorithm 1 to check whether the network is under a clone attack and, if clones exist, executes the MDS function (steps 7 to 10 of Algorithm 1) multiple times to identify the clone IDs. In this situation, the BS has to perform, on average, $O(n^c)$ rounds of MDS calculations⁴ to find the clones, provided that $c$ clones exist in the network. Although a single MDS computation is fast, the iterative calculation of MDS may still impose a huge computation overhead on the BS. For example, the execution of the MDS on a matrix of dimension $10^4 \times 10^4$ takes approximately two minutes. In the case of only one clone in a network of $10^4$ IoT devices, the BS needs to perform the MDS, on average, $5 \times 10^3$ times, which requires hours or even days to detect the clones. To address this issue, we propose an improvement to the calculation of the MDS function. In particular, we show that the MDS calculation can be parallelized and offloaded to several powerful servers or devices, each of which calculates one of the required iterations, thereby speeding up the whole clone detection algorithm. We show that our proposal significantly reduces the computation load on the BS, leading to improved scalability and performance of the clone detection method. We provide a detailed explanation of our improved implementation in the following.

⁴ In one extreme case, assume that there is one clone group consisting of $c$ clones in the network. Since these $c$ clones share the same ID (where ID $\in \{1, \ldots, n\}$), MDSClone has to perform the MDS calculation $O(n)$ times (on average $n/2$ times) to identify the clone ID. Consider another extreme case, where $c$ clone groups exist in the network. Since the BS has no knowledge of the number of clones (i.e., $c$), the BS has to scan every possible combination of clone IDs, leading to $O(n^c)$ MDS calculations.
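The cost figures quoted above can be checked with a few lines of arithmetic. The sketch below uses the numbers from the text (two minutes per MDS run, $n/2$ expected runs for a single clone ID); the `workers` parameter and the assumption of perfect linear scaling across workers are illustrative simplifications.

```python
def detection_days(n_nodes, minutes_per_mds, workers=1):
    """Expected wall-clock days to locate one clone ID, assuming ideal parallel scaling."""
    expected_rounds = n_nodes / 2            # average number of MDS rounds
    total_minutes = expected_rounds * minutes_per_mds / workers
    return total_minutes / (60 * 24)

sequential = detection_days(10_000, 2)               # about 6.9 days on one machine
parallel = detection_days(10_000, 2, workers=100)    # about 100 minutes on 100 workers
```

Under these assumptions, the sequential search takes roughly a week, which is consistent with the "hours or even days" estimate and motivates both the per-run speed-ups and the parallel execution described in this section.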
1) Speeding Up the MDS Calculation: In general, a significant portion of the computation overhead in the MDS calculation is incurred by computing eigenpairs, i.e., the pairs of eigenvalues and eigenvectors in Equation (1). In addition, the computation load of the MDS grows with the number of nodes in the network. To ensure the scalability of MDSClone, and observing that the inner product matrix $B$ in our context is always real-valued and symmetric, we propose three techniques to improve the performance of the MDS calculation in the MDSClone algorithm:
a) CIPMLO computes the inner product matrix with fewer arithmetic operations.
b) TI uses a modified Householder transformation [35] to speed up the calculation of eigenpairs.
c) SMEBM is closely related to TI and essentially speeds up TI.

Each of these three techniques yields some speed-up. Among them, CIPMLO and TI can be applied individually, while SMEBM is useful only when it is used together with TI. We detail each of the three techniques in the following.

a) Computing Inner Product Matrix With Less Operations (CIPMLO):
This technique computes the inner product matrix $B$ with fewer arithmetic operations. A concrete example helps to convey how CIPMLO achieves the speed-up. Assume that we have a distance matrix $D$, from which we derive the matrices $C$ and $A$ and calculate the matrix $\Phi = C \cdot A$. Observing that the elements in the $i$-th column of $\Phi$ are related to the diagonal element $\Phi_{i,i}$, we can derive the non-diagonal elements from the diagonal elements. More formally, the elements of $\Phi$ can be expressed in terms of $D_i$ and $N_k$, where $D_i = \sum_{j=1}^{n} A_{j,i}$ and $N_k$ is a non-diagonal element in the lower triangular part of $A$, with the elements of $A$ being renumbered accordingly.

Similarly, we find that each element of $B$ is related to the sum of the row elements of $\Phi$. In fact, $B$ can be expressed in terms of $R_i$, where $R_i$ is defined as $R_i = S - nD_i$ with $S = \sum_{i=1}^{n} D_i$. We then prove that $R_i$ is the sum of the $i$-th row of $\Phi$.

Lemma 2: $R_i$ is the sum of the $i$-th row of $\Phi$.

We then prove that $B = \Phi C$ calculated by the proposed procedure remains symmetric, where the first and second equalities of the proof follow from Equations (18) and (14), respectively.

Since $B = \Phi C$ is symmetric, we need to calculate only the elements of its upper or lower triangular part. In addition, the upper triangular part of $B$ in Equation (18) depends only on the upper triangular part of $\Phi$ in Equation (16). In other words, we can calculate only half of the elements of $\Phi$ and then compute $B$ with approximately half of the computational burden of the original task.
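The idea of replacing the two matrix products by column sums can be checked numerically. The sketch below rests on two assumptions that are not stated in the surviving text: $A$ holds the squared pairwise distances, and $C = \mathbf{1}\mathbf{1}^T - nI$, which is the only choice for which the row sums of $C \cdot A$ equal $S - nD_i$ for every symmetric $A$. Under these assumptions each entry of $C A C$ is $S - nD_i - nD_j + n^2 A_{i,j}$, and the classical MDS inner product matrix is that value divided by $-2n^2$. The function names are illustrative.

```python
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

def inner_product_via_products(A):
    """Reference: form C and evaluate (C A) C with two full matrix products."""
    n = len(A)
    C = [[1 - n if i == j else 1 for j in range(n)] for i in range(n)]
    return matmul(matmul(C, A), C)

def inner_product_via_sums(A):
    """Same matrix from column sums only, filling just the upper triangle."""
    n = len(A)
    D = [sum(A[j][i] for j in range(n)) for i in range(n)]   # column sums D_i
    S = sum(D)
    R = [S - n * D[i] for i in range(n)]                     # row sums of C A
    B = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            phi_ij = D[j] - n * A[i][j]                      # entry (i, j) of C A
            B[i][j] = B[j][i] = R[i] - n * phi_ij            # mirror by symmetry
    return B

points = [(0, 0), (3, 0), (3, 4), (0, 4)]
A = [[(px - qx) ** 2 + (py - qy) ** 2 for qx, qy in points] for px, py in points]
n = len(points)
B_fast = inner_product_via_sums(A)
B_ref = inner_product_via_products(A)
gram_00 = -B_fast[0][0] / (2 * n * n)   # squared norm of the first centred point
```

The sum-based version performs $O(n^2)$ work instead of the $O(n^3)$ work of two dense products, and it visits only half of the entries, which matches the halving argument made above.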
Note that the third equality is due to Equation (18).

b) Tridiagonalization Improvement (TI):
Before calculating the eigenvalue decomposition, the existing library for eigenvalue computation [36] introduces a pre-processing phase to reduce computation. In particular, when the matrix is symmetric, one can apply the Householder transform to the input matrix and obtain a tridiagonal matrix. The eigenvalue decomposition is then applied to derive the eigenvalues and eigenvectors. In our context, we focus only on $B$, which is naturally symmetric. Consequently, our second technique, TI, speeds up the MDS calculation by improving the performance of matrix tridiagonalization.
Basically, TI achieves the speed-up because some matrix multiplications can be replaced by matrix additions and inner product calculations. To illustrate the basic idea behind the design, we first review how the Householder transform works. Let $A = A_0$ be a real-valued symmetric matrix of dimension $n \times n$. We can reduce $A$ to the tridiagonal form $A_{n-2}$ by iteratively applying the decomposition

$$A_m = P_m A_{m-1} P_m, \quad m = 1, \ldots, n-2,$$

where $P_m = \begin{bmatrix} H_{n-m} & 0 \\ 0 & I_m \end{bmatrix}$ is an orthogonal matrix, with $I_m$ being the $m \times m$ identity matrix and $H_{n-m}$ being a Householder matrix defined as

$$H_{n-m} = I - \frac{2\, v_{n-m} v_{n-m}^{T}}{v_{n-m}^{T} v_{n-m}}.$$

Here $v_{n-m} = x_{n-m} - \sigma e_{n-m}$ is a Householder vector, where $x_{n-m}$ is a column vector composed of the first $n-m$ elements of the $m$-th column of $A$; $\sigma = -\mathrm{sign}(x_\bullet)\,\|x_{n-m}\|$ has the length of $x_{n-m}$ as its magnitude, with $x_\bullet$ being the last element of $x_{n-m}$; $e_{n-m} = (0, 0, \ldots, 1)^{T}$ is the last standard basis vector of dimension $(n-m) \times 1$; and $\mathrm{sign}(x_\bullet)$ denotes the sign of $x_\bullet$. In this way, we can generate the tridiagonal form of a given symmetric matrix.

More concretely, consider the first iteration of the Householder transformation of the symmetric matrix

$$A = \begin{bmatrix} e_1 & e_2 & e_3 & e_4 \\ e_2 & e_5 & e_6 & e_7 \\ e_3 & e_6 & e_8 & e_9 \\ e_4 & e_7 & e_9 & e_{10} \end{bmatrix},$$

with $\mathit{sqrt} = \sqrt{e_4^2 + e_7^2 + (e_9 + \sigma)^2}$. From these we obtain $P_1$ and, as concrete examples, some elements of $P_1 A$ (Equations (24) and (25)). Before describing our approach based on the modified Householder transformation, we highlight some observations from Equations (24) and (25):
1) The elements in the same column have similar calculations.
2) All elements of $P_1 A$ share some common operations (part (a)).
3) The elements in part (b) are elements of matrix $A$.
4) The calculations in part (c) are the same, as are the calculations in part (d).
5) The remaining parts, $e_4$ and $e_7$, are elements of $v_{n-1}$.

TI mainly speeds up some parts of the numerators of the elements in $P_1 A$ (e.g., parts (c) and (d)); for the time being, we focus only on part (c). Considering $e_1 e_4 + e_2 e_7 + e_3 (e_9 + \sigma)$, we recognize that it is the inner product of $v_{n-1}^{T}$ and $[e_1\ e_2\ e_3]^{T}$ (i.e., part of matrix $A$). So, $X_{123} = v_{n-1}^{T} \cdot [e_1\ e_2\ e_3]^{T}$ and $X_{256} = v_{n-1}^{T} \cdot [e_2\ e_5\ e_6]^{T}$. Let $V_r$ be the inner product of $v_{n-1}^{T}$ and $[A_{r,1}\ A_{r,2}\ A_{r,3}]$. Then $P_1 A$ can be expressed in terms of $V_r$, where $\otimes$ denotes element-wise multiplication and $E$ is the vector of all 1's. By considering Equations (23), (24), and (25), the same observations and procedures can also be applied to $P_1 A P_1\ (= A_1)$, where $S = -2/\mathit{sqrt}^2$. We can see from Equations (28) and (29) that the elements in the same column of $A_1$ undergo the same operations. For example, the multiplication by $S$ (part (e)) and an inner product (part (f)) are common operations. We can also see that the elements in part (g) are from matrix $A$. Thus, $A_1$ can be formulated as in Equation (30). In the following lemma, we prove that $A_1$ calculated in the above way is symmetric.
Lemma 4: $A_1$ calculated in the above way is symmetric.

Proof: The claim follows from the form that $A_1$ takes after the first iteration, according to the property of the Householder transformation.

We next formally prove that the upper-right entry of $A_1$ calculated in the above way is zero.

Lemma 5: The upper-right entry of $A_1$ calculated in the above way is zero.

Proof: Assume $(P_1 A P_1)_{i,j} = 0$, where $i \le j - 2$.

We can see from Equation (30), Lemma 4, and Lemma 5 that the original matrix multiplications in the Householder transform can be replaced by matrix additions and inner product calculations. The same observations and procedures can also be applied to $A_2$, $A_3$, and so on. As a consequence, we have the following corollary.

Corollary 1: The matrix multiplications involved in the calculation of $A_i$, where $i = 1, \ldots, n-2$, can be replaced by inner product and matrix addition calculations.

Our proposed TI for the Householder transformation thus achieves the speed-up by taking advantage of the computationally cheaper inner product and matrix addition calculations. Compared with the current software implementation, EISPACK [36], our proposed algorithm needs only about half of the computation.
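The claim that the products in $P_1 A P_1$ can be traded for inner products and additions can be illustrated with the textbook rank-two update form of a Householder step. This is a standard identity, shown here as an analogy; it is not the authors' TI formulation, whose equations are not reproduced above. With $\beta = 2 / (v^T v)$, $p = \beta A v$, $K = \beta (v^T p)/2$, and $w = p - K v$, one has $P A P = A - v w^T - w v^T$. The function names and the test matrix are illustrative.

```python
import math

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

def householder_vector(A):
    """v = x - sigma*e for the first n-1 entries of the last column, zero-padded to length n."""
    n = len(A)
    x = [A[i][n - 1] for i in range(n - 1)]
    norm = math.sqrt(sum(t * t for t in x))
    sigma = -math.copysign(norm, x[-1])
    return x[:-1] + [x[-1] - sigma, 0.0]

def step_with_products(A, v):
    """Reference: build P = I - beta v v^T explicitly and multiply P A P."""
    n = len(A)
    beta = 2.0 / sum(t * t for t in v)
    P = [[(1.0 if i == j else 0.0) - beta * v[i] * v[j] for j in range(n)] for i in range(n)]
    return matmul(matmul(P, A), P)

def step_with_updates(A, v):
    """Same result using n inner products and a rank-two additive update."""
    n = len(A)
    beta = 2.0 / sum(t * t for t in v)
    p = [beta * sum(A[i][k] * v[k] for k in range(n)) for i in range(n)]
    K = 0.5 * beta * sum(v[i] * p[i] for i in range(n))
    w = [p[i] - K * v[i] for i in range(n)]
    return [[A[i][j] - v[i] * w[j] - w[i] * v[j] for j in range(n)] for i in range(n)]

A0 = [[4.0, 1.0, 2.0, 2.0],
      [1.0, 3.0, 0.0, 1.0],
      [2.0, 0.0, 2.0, 2.0],
      [2.0, 1.0, 2.0, 1.0]]
v = householder_vector(A0)
A1_ref = step_with_products(A0, v)
A1_fast = step_with_updates(A0, v)
```

The update form costs $O(n^2)$ per step instead of the $O(n^3)$ of two explicit products, and because the result is symmetric only one triangle actually needs to be computed.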
c) Searching for Meaningful Eigenpairs of Block Matrix (SMEBM):
If the reconstructed coordinates are constrained to be two-dimensional, the two largest eigenpairs suffice to reconstruct $X$. Note that "meaningful eigenpairs" here refers to those eigenpairs that have an impact on the coordinates of interest. More specifically, as stated previously, we perform tridiagonalization and eigenvalue decomposition to derive $X$. However, we observe that the two-dimensional node map is affected only by specific elements. The idea behind SMEBM is therefore to calculate only the entries that may affect the eigenpairs that have an impact on the two-dimensional coordinates. One distinguishing feature of SMEBM is that, whereas TI may need $n - 2$ iterations to calculate $A_1, \ldots, A_{n-2}$, SMEBM needs only two or three iterations.

Given that all the elements of $A = A_0$ are very small, we observe from the tridiagonalized matrix that only the values of a small block submatrix are meaningful. In particular, there are two forms of block submatrices.

The first form is a $3 \times 3$ block submatrix (see the former part of Equation (32)), containing two meaningful eigenvalues, where one eigenvalue is close to zero and is always at the bottom-right corner. The second form is composed of two $2 \times 2$ block submatrices (see the latter part of Equation (32)): they can be found in the diagonal part of the tridiagonalized matrix, and one of them is guaranteed to be at the bottom-right corner. Furthermore, in the case of two $2 \times 2$ block submatrices, each block submatrix contributes one meaningful eigenpair. We can focus only on the block submatrices because all the elements of the tridiagonal matrix are close to zero except for the elements of the block submatrices.
To better illustrate the idea behind SMEBM, we start with a concrete example. Consider a $5 \times 5$ tridiagonal matrix in Equation (33) with a $3 \times 3$ block submatrix and "*" elements between $10^{-16}$ and $10^{-10}$.

Since the "*" elements lie between $10^{-16}$ and $10^{-10}$, we consider them to be zero. The eigenvalues of the tridiagonal matrix can then be obtained as the eigenvalues of the block-diagonal matrix formed by padding the block submatrix with a zero submatrix $O$. This works only in the case where each element is sufficiently small. Nevertheless, a preprocessing step can be used to counteract this problem. After calculating the inner product matrix, we scale it by calculating $B/\alpha$, where $\alpha = \sum_{d=1}^{n} B_{d,d} + \sum_{r=1}^{n-1} \sum_{c=r+1}^{n} B_{r,c}$. Because of this preprocessing of the inner product matrix $B$, the eigenvalues calculated after the scaling, $\tilde{\mu}_i$, are not the same as the original ones, $\mu_i$. However, $\mu_i$ can be recovered by calculating $\mu_i = \alpha \tilde{\mu}_i$, $i = 1, \ldots, n$. Since we only need the block submatrix, we can skip some steps when performing the Householder transformation. In particular, for a $3 \times 3$ block submatrix, we need to transform the inner product matrix only twice. For two $2 \times 2$ block submatrices, we need to transform the inner product matrix two or three times.
From Equation (34), we can see that, after one iteration of the Householder transformation, the positions marked with "+" become zero, and the elements at the positions marked with "*" change because they are multiplied by the Householder matrix. In the case of a $3 \times 3$ block submatrix, $e_2$ must be greater than $\varepsilon \approx 10^{-10}$, and we do not need to compute the remaining parts marked with "?". In the case of $e_2 \le \varepsilon$, we deflate the matrix, resulting in the case of two $2 \times 2$ block submatrices. Basically, we check whether the diagonal elements are greater than $\varepsilon$. To search for the two $2 \times 2$ block submatrices, we distinguish three cases:
1) One $2 \times 2$ block submatrix is at the bottom-right corner, while the other is a $1 \times 1$ block submatrix (i.e., a degenerate $2 \times 2$ block submatrix with the bottom-right element being 0) at the upper-left corner.
2) One $2 \times 2$ block submatrix is at the bottom-right corner, while the other is a $2 \times 2$ block submatrix at the upper-left corner.
3) One $2 \times 2$ block submatrix is at the bottom-right corner, while the other is a $2 \times 2$ block submatrix in the diagonal part of the matrix (but not at the upper-left corner).

The above algorithm for a $5 \times 5$ matrix can be extended to handle an $n \times n$ matrix. Algorithm 3 searches for meaningful eigenpairs with the minimal number of iterations of the Householder transformation. The two largest eigenpairs among the three calculated eigenpairs in the case of a $3 \times 3$ meaningful block submatrix, or among the four calculated eigenpairs in the case of two $2 \times 2$ meaningful block submatrices, suffice to reconstruct $X$.
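Two facts used above can be verified on a toy example: scaling $B$ by $\alpha$ scales every eigenvalue by the same factor, and the two largest eigenpairs are enough to rebuild two-dimensional coordinates with $XX^T = B$. The sketch below uses plain power iteration with deflation as a stand-in eigen-solver; it does not implement the block submatrix search of SMEBM, and the function names and the four test points are illustrative.

```python
import math

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def top_eigenpair(M, iters=500):
    """Largest eigenpair of a symmetric positive semidefinite matrix by power iteration."""
    v = [1.0 / (i + 1) for i in range(len(M))]      # fixed generic start vector
    for _ in range(iters):
        w = matvec(M, v)
        norm = math.sqrt(sum(t * t for t in w))
        v = [t / norm for t in w]
    lam = sum(a * b for a, b in zip(v, matvec(M, v)))
    return lam, v

def two_largest_eigenpairs(B):
    n = len(B)
    # alpha as defined in the text: diagonal sum plus strict upper-triangle sum
    alpha = (sum(B[d][d] for d in range(n))
             + sum(B[r][c] for r in range(n - 1) for c in range(r + 1, n)))
    scaled = [[b / alpha for b in row] for row in B]
    pairs = []
    for _ in range(2):
        lam, v = top_eigenpair(scaled)
        pairs.append((alpha * lam, v))              # undo the scaling
        scaled = [[scaled[i][j] - lam * v[i] * v[j] for j in range(n)]
                  for i in range(n)]                # deflate the found eigenpair
    return pairs

centred = [(-1.5, -2.0), (1.5, -2.0), (1.5, 2.0), (-1.5, 2.0)]
B = [[px * qx + py * qy for qx, qy in centred] for px, py in centred]
(mu1, u1), (mu2, u2) = two_largest_eigenpairs(B)
X = [(math.sqrt(mu1) * a, math.sqrt(mu2) * b) for a, b in zip(u1, u2)]
rebuilt = [[px * qx + py * qy for qx, qy in X] for px, py in X]
```

The rebuilt coordinates agree with the original ones only up to rotation and reflection, which is exactly the ambiguity that the node alignment step resolves.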

VI. EXPERIMENTAL ANALYSIS
In order to evaluate the performance of MDSClone in detecting clones, we conducted several experimental analyses considering various network settings and evaluation criteria (i.e., detection probability, computation time, and memory and energy consumption). To study the practicality of our proposed MDSClone scheme for the current generation of sensors in an IoT environment, we implemented a prototype of our scheme on TelosB motes running TinyOS (with the following specifications: Micro-Controller: TI MSP430F1611; ROM: 48KB + 256B; RAM: 10KB; Radio Chipset: ChipCon CC2420). We executed our algorithm using TOSSIM [37] on TinyOS 1.1.15 to evaluate the energy consumption of MDSClone. Note that TOSSIM is a discrete-event simulator, designed especially for TinyOS WSNs, on which TinyOS code can be executed directly. Owing to this feature, although TOSSIM is in essence a simulator, its estimation of energy consumption is rather accurate. In our experimental setting, we considered different network sizes varying from 1,000 to 10,000. Moreover, we considered different numbers of clone groups in the network, varying from two to 14.

Algorithm 3 Calculating Meaningful Eigenpairs
1: SETTING: $(A_i)_{x,y}$ is the element of $A_i$ at the intersection of the $x$-th row and $y$-th column
2: SETTING: $A_{[(i,j),(x,y)]}$ is the submatrix whose upper-left (bottom-right) element is $A_{i,j}$ ($A_{x,y}$)
3: INPUT: $A \in \mathbb{R}^{n \times n}$ with small values as matrix elements
4: Calculate $A_1 = \mathrm{HOUSEHOLDER}(A)$ and $A_2 = \mathrm{HOUSEHOLDER}(A_1)$
...

A. Evaluation Results
Because each node in MDSClone only needs to sense the RSS, send out the measured distances from its neighbors, and forward the received neighbor-distance information to the BS, MDSClone in fact only incurs a limited memory overhead.The neighboring information occupies 12029 bytes in ROM, and only 602 bytes in RAM.On the other hand, because each node is assumed to only execute the above steps, some delay will be incurred when a node uses the MDSClone algorithm.Owing to the fact that a node only sends one packet per second (which is our considered setting), the reported detection time will be affected by such a setting.If we ignore the time delay incurred by our hardware setting, we can observe and infer that the computation time on a sensor node is 0.25 seconds.The results of the TOSSIM simulation show that for the MDSClone operations on the mote (i.e., finding the neighboring nodes), the energy consumption due to the use of the microcontroller is 1222 mJ, and the energy consumption due to the use of the radio circuit is 2021 mJ.It is worth noting that the computation of the MDS, which is the main function of our proposed algorithm, is basically performed by the BS, and therefore the computational overhead and energy consumption imposed on the sensor nodes are negligible.
In Figure 3 we report the result of our evaluation of the clone detection probability of MDSClone considering three network settings: (a) varying the total number of nodes in the network from 1,000 to 10,000, while assuming there are two clones in the network; (b) considering a fixed number of nodes in the network, i.e., $n = 1{,}000$, and varying the number of clones in an ideal network setting without noise; (c) considering a fixed number of nodes, $n = 1{,}000$, and varying the number of clones, assuming the environment to be noisy due to node mobility, as explained in Section V-A.1.c. For this experiment we adopted the value of λ that we calculated in Section V-A.1.a. In practice, the network owner may choose a distortion threshold smaller than the one calculated in Section V-A.1.a, in order to ensure the successful detection of nearly all clones. Indeed, the choice of a small λ may lead to false positives, i.e., some genuine nodes may be regarded as clones because of distortion due to noisy distance measurements or the use of the shortest path to approximate the Euclidean distance between two nodes in the MDS calculation. Because the BS may perform attestation on suspected clones instead of a network-wide revocation of clone IDs, the BS may find that a suspected clone under attestation is in fact a genuine node. In this manner, the BS can still almost perfectly detect the clones at the expense of rare false positives.
As we can see in Figure 3, the clone detection probability of MDSClone is 100% in various scenarios. In particular, Figure 3a shows that MDSClone has a full detection probability for both large and small scale networks. Figure 3b depicts the probability of clone detection when varying the number of clone groups. We recall that, as explained in Section III-B, in each clone group we considered two clones, for simplicity. In this evaluation, we considered a network of size $n = 1{,}000$, and we varied the number of clones from two to 14. The reason behind this choice of range is that if we are able to detect the clones when their number in the network is small, it would be much easier to distinguish them when there are a large number of clones in the network. This follows because, from the MDS point of view, more clone groups actually imply more distortion on the reconstructed node map, and therefore they are easier for the BS to detect. Moreover, Figure 3c shows that MDSClone is robust to noisy distance measurements. In particular, as can be seen in the figure, even in the case of N(2, 10) noise applied to each distance measurement (a Gaussian distribution with mean two and standard deviation two), MDSClone is still able to reconstruct the node map with a slight increase in the approximate preservation distance λ. The approximate preservation of the node map can therefore be used to identify the clones.
It is worth mentioning that some extremely rare cases might still affect the clone detection probability of MDSClone. For example, consider the case in which clones are the same distance from most of the other nodes in the network. For instance, let A and C represent two genuine nodes forming a connection, while A, B, C, and B' form a rectangle, where B and B' are clones sharing the same ID and have the same distance to A and C. If the shape formed by the genuine nodes is symmetric (e.g., the line formed by A and C in the above example), then the attacker can strategically place the two clones (e.g., B and B' in the above example) in such a way that the distances between the genuine nodes and clones are all preserved. In this case, MDSClone fails to detect the clones. In general, the above argument can be extended to the case where the shape formed by genuine nodes satisfies rotational symmetry. If so, an attacker can place a particular number of clones in the network such that the distances between genuine nodes and clones are all preserved, in order to evade detection by MDSClone. However, we argue that such cases occur very rarely in hybrid networks. Considering that (some) nodes in hybrid networks have mobility, the shape formed by genuine nodes is constantly changing. At any time slot, the underlying network topology looks like a random geometric graph, and the probability that the shape is symmetric or rotationally symmetric is very low.
Another criterion that we considered in evaluating the performance of MDSClone is the computational time. As is evident in Algorithm 1, the main computational complexity of MDSClone relates to the computing of the MDS function by the BS, and the other operations contribute negligible computational overheads to MDSClone. Therefore, here we concentrate on the MDS calculation time, and emphasize how our proposed acceleration techniques help to reduce the computational time. More specifically, in Section V-B.1 we proposed three techniques to speed up the MDS calculation. In Figure 4a and Figure 4b we compare the computational times for our proposals and the conventional MDS calculation. In particular, in Figure 4a we implemented the TI method (explained in Section V-B.1.b) to considerably speed up the calculation of the inner product matrix $B$, which constitutes a significant portion of the computational burden in the MDS calculation. Figure 4a shows that our developed TI technique achieves a speed-up by a factor of five compared with the original MDS for a matrix of dimension 10000 × 10000, simulated in MATLAB and adopting the built-in function "mdscale". The speed-up gain will be significantly higher when considering a larger matrix. In addition, although the speed-up applies only to the MDS calculation, MDSClone as a whole also achieves a speed-up by a factor of five, because MDS constitutes the core part of MDSClone.
As can be seen in Figure 4a, the TI method calculates the inner product matrix $B$ faster than the original MDS algorithm. However, one may have concerns regarding the overall performance improvement of the MDS calculation. Figure 4b shows that by using our developed speed-up techniques (CIPMLO, TI, and SMEBM), we can enhance the speed by more than five times compared with the conventional MDS calculation (e.g., the combined use of a Householder transform and eigendecomposition). The performance discrepancy between Figure 4a and Figure 4b can be attributed to the fact that the speed-up shown in Figure 4b is the result of adopting CIPMLO, TI, and SMEBM, while the speed-up shown in Figure 4a is the result of merely using TI. Another important observation from Figure 4b is that for a network size of 1,000 nodes, our MDS calculation requires around two seconds, and in a very large-scale scenario with 10,000 nodes, it takes less than 10 seconds. It is worth mentioning that in an IoT environment, considering 1,000 nodes in a network is more realistic, and reflects several real-world applications, such as a smart hospital or smart building. At this stage, one may have concerns regarding the total detection time of MDSClone, because Figure 4b shows only one computation of the MDS function, while in MDSClone the BS needs to iteratively perform the MDS calculation in order to identify the clones in the network. It should be noted that because the MDS calculation in each iteration is performed on a different set of nodes in the network, each iteration is independent of the others. Thanks to this feature, different iterations of the MDS calculation can easily be performed in parallel by several different devices or powerful servers, leading to a reduction in the computational time of the whole algorithm.
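The independence of the rounds is what makes the search easy to distribute, as the following sketch suggests. Each task removes one candidate ID and scores the remaining distance matrix. To keep the example short and dependency-free, the score is a count of triangle-inequality violations, which is only a stand-in for the MDS-based distortion check of Algorithm 1. The node layout, the rule that a cloned ID reports the distance to its nearest copy, and all function names are illustrative assumptions.

```python
import math
from concurrent.futures import ThreadPoolExecutor

def violations_without(dist, skip):
    """Stand-in for one MDS round: triangle-inequality violations with node `skip` removed."""
    ids = [i for i in range(len(dist)) if i != skip]
    return sum(1 for a in ids for b in ids for c in ids
               if dist[a][c] > dist[a][b] + dist[b][c] + 1e-9)

def find_clone_ids(dist, workers=4):
    """Run one independent round per candidate ID and keep the IDs whose removal fixes the map."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(lambda k: violations_without(dist, k), range(len(dist))))
    return [k for k, score in enumerate(scores) if score == 0]

genuine = [(0, 0), (10, 10), (0, 1), (10, 9)]     # IDs 0..3
clone_sites = [(1, 1), (9, 9)]                    # two physical copies sharing ID 4

def euclid(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

size = len(genuine) + 1
dist = [[0.0] * size for _ in range(size)]
for i, p in enumerate(genuine):
    for j, q in enumerate(genuine):
        dist[i][j] = euclid(p, q)
    # a neighbour of either copy reports the distance to the copy it can hear
    dist[i][4] = dist[4][i] = min(euclid(p, site) for site in clone_sites)

suspects = find_clone_ids(dist)
```

Threads are used here only to show the structure; for CPU-bound MDS rounds one would more likely dispatch the tasks to separate processes or servers, as the text suggests.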

B. Comparative Analysis
In this section, we provide a comparative analysis between MDSClone and the state-of-the-art clone detection methods.For our comparison, we consider the following performance metrics: (i) The amount of time required for clone detection.(ii) The communication overhead at an IoT device.(iii) The memory overhead at an IoT device.We exclude XED [10] and EDD [10], [11] from our comparison, because XED is vulnerable to collusive clones and EDD only works when considering a random waypoint mobility model.
In Figure 5a, we present the required time for clone detection. As can be seen, TDD [12] and MDSClone require considerably less time compared with the other detection methods. Note that TDD and MDSClone are centralized solutions, and therefore each node needs to forward some information (materials for clone detection) to a specific location in the network (e.g., the BS in the case of MDSClone). Because the average hop distance between two nodes in a flat network with $n$ randomly distributed nodes is $O(\sqrt{n})$, with the assumption that each hop distance movement requires one time unit, $O(\sqrt{n})$ time delays will also be considered in TDD and MDSClone. Because each node in distributed protocols such as SDD-LC [12], SDD-LWC [12], and HIP-HOP [5] requires more time to identify witness nodes, distributed detection methods usually incur significantly more delays for clone detection.
Figure 5b shows the communication overhead imposed on each node.As shown in the figure, TDD and MDSClone impose higher communication overheads.This is because of the fact that, similarly to the above case, in these two methods each node needs to forward the required information to a central node.On the other hand, in the case of a distributed solution, each node only needs to communicate with its neighbors.However, it is worth noting that although the communication overhead in MDSClone is seemingly higher than that in distributed solutions, it can be substantially reduced in certain applications.For example, in certain IoT applications (e.g., a smart city) there could be multiple relays capable of cellular communication (e.g., LTE) forming a backbone network with the BS.An IoT device in the proximity of a relay node can forward the neighbor-distance information to the nearby relay, which then forwards the neighbor-distance information back to the BS.In this manner, not only the time, but also the communication cost, can be significantly reduced, while still guaranteeing the detection capability of MDSClone.It is worth noting that the effect of such a scenario on other existing clone detection methods remains unclear.
In addition, we conducted experiments regarding the storage overhead imposed on the IoT nodes.The comparison results are shown in Figure 5c.The storage overhead of MDSClone is close to zero.The reason for this is that in MDSClone each node simply collects neighbor-distance information from neighboring nodes and forwards it to the BS.After forwarding, the node can remove this information from its memory, resulting in a memory footprint of close to zero.However, all of the other detection methods require the IoT device to maintain the historic neighbor-distance information for some time to identify clones, resulting in a considerable memory overhead.

VII. CONCLUSION
In this paper, we have proposed a clone detection solution, called MDSClone, based on the multidimensional scaling (MDS) algorithm for a heterogeneous IoT environment. We have taken into account the specific features of IoT devices in designing MDSClone, i.e., unawareness of geographical positions, the possibility of being both static and mobile, and the lack of a specific mobility pattern. We showed (in Table I) that compared with the existing clone detection methods, MDSClone provides an outstanding approach, because it is the first method that supports hybrid networks, while its memory cost is of order O(1), its communication cost is affordable, and it is a location-independent method. Moreover, we showed that the clone detection probability of MDSClone is almost 100%, and the MDS calculation algorithm could be parallelized, leading to a shorter detection delay. Therefore, considering all of its advantages, we believe that MDSClone could be considered as a superior candidate for clone detection in real-world IoT scenarios. However, in the case of dense network topologies, our proposal may impose a communication overhead on the network. Therefore, in future work we aim to provide a distributed version of MDSClone for IoT scenarios.

Fig. 1. An example of the MDS-MAP procedure. (a) Nodes send their pairwise distances to the BS. (b) BS uses MDS's output to rebuild a map.

Fig. 2. An example IoT network with node B as a clone (we named the clone nodes as B and B' for clarification). (a) Two nodes with the same ID (nodes B and B'). (b) Distorted reconstructed node map.
MDS calculations, where $O(n^i)$ comes with $n^i$ possibilities of one clone ID (where $i = 1, \ldots, c$).

Fig. 4. Analysis of the MDS speed-up proposals. (a) Computational time of the inner product matrix $B$ vs. number of nodes. (b) MDS calculation time vs. number of nodes.

Fig. 5. Performance comparison between MDSClone and state-of-the-art clone detection methods.(a) Clone detection time.(b) Communication overhead of a node.(c) Memory overhead of a node.