Robust Graph Neural Network Against Poisoning Attacks via Transfer Learning

Xianfeng Tang†, Yandong Li§, Yiwei Sun†, Huaxiu Yao†, Prasenjit Mitra†, Suhang Wang†*
†Pennsylvania State University, §University of Central Florida
{tangxianfeng, lyndon.leeseu}@outlook.com, {yus162, huy144, pum10, szw494}@psu.edu
*Corresponding author.

ABSTRACT

Graph neural networks (GNNs) are widely used in many applications. However, their robustness against adversarial attacks has been criticized. Prior studies show that unnoticeable modifications of graph topology or nodal features can significantly reduce the performance of GNNs. Designing robust graph neural networks against poisoning attacks is very challenging, and several efforts have been made. Existing work aims to reduce the negative impact of adversarial edges using only the poisoned graph, which is sub-optimal because such methods cannot discriminate adversarial edges from normal ones. On the other hand, clean graphs from domains similar to the target poisoned graph are usually available in the real world. By perturbing these clean graphs, we can create supervised knowledge to train the ability to detect adversarial edges, thereby elevating the robustness of GNNs. This potential of clean graphs, however, is neglected by existing work. To this end, we investigate a novel problem of improving the robustness of GNNs against poisoning attacks by exploring clean graphs. Specifically, we propose PA-GNN, which relies on a penalized aggregation mechanism that directly restricts the negative impact of adversarial edges by assigning them lower attention coefficients. To optimize PA-GNN for a poisoned graph, we design a meta-optimization algorithm that trains PA-GNN to penalize perturbations using clean graphs and their adversarial counterparts, and transfers this ability to improve the robustness of PA-GNN on the poisoned graph. Experimental results on four real-world datasets demonstrate the robustness of PA-GNN against poisoning attacks on graphs.

ACM Reference format:
Xianfeng Tang, Yandong Li, Yiwei Sun, Huaxiu Yao, Prasenjit Mitra, Suhang Wang. 2016. Robust Graph Neural Network Against Poisoning Attacks via Transfer Learning. In Proceedings of ACM Conference, Washington, DC, USA, July 2017 (Conference'17), 9 pages. DOI: 10.1145/nnnnnnn.nnnnnnn

1 INTRODUCTION

Graph neural networks (GNNs) [8, 13, 18], which explore the power of neural networks for graph data, have achieved remarkable results in various applications such as social recommendation [9] and natural language processing [16, 40]. The key to the success of GNNs is their signal-passing process [34], in which information from neighbors is aggregated for every node in each layer. The collected information enriches node representations, preserving both nodal feature characteristics and topological structure.
Though GNNs are effective for modeling graph data, the way they aggregate neighbor nodes' information for representation learning makes them vulnerable to adversarial attacks [7, 35, 37, 41, 42]. A poisoning attack on a graph [41], which adds or deletes carefully chosen edges in the graph topology or injects carefully designed perturbations into nodal features, can contaminate the neighborhoods of nodes, bring noise and errors into node representations, and degrade the performance of GNNs significantly. This lack of robustness becomes a critical issue for GNNs in applications such as financial systems and risk management [1]. For example, fake accounts created by a hacker can befriend normal users on a social network to promote the scores a GNN model predicts for them. A model that is not robust enough to resist such "cheap" attacks could lead to serious consequences. Hence, it is important to develop robust GNNs against adversarial attacks. Recent studies of adversarial attacks on GNNs suggest that adding perturbed edges is more effective than deleting edges or adding noise to node features [35]. This is because node features are usually high-dimensional, requiring larger budgets to attack, and deleting edges only removes some information, whereas adding edges is a cheap way to contaminate information passing dramatically. For example, adding a few bridge edges connecting two communities can affect the latent representations of many nodes. Thus, we focus on defense against the more effective poisoning attacks, in which a training graph is poisoned with injected adversarial edges.

To defend against injected adversarial edges, a natural idea is to delete these edges or reduce their negative impact. Several efforts have been made in this direction [17, 35, 39]. For example, Wu et al. [35] utilize the Jaccard similarity of features to prune perturbed graphs, under the assumption that connected nodes have high feature similarity. RGCN [39] introduces Gaussian constraints on model parameters to absorb the effects of adversarial changes. These models rely only on the poisoned graph for training, leading to sub-optimal solutions: the lack of supervised information about the real perturbations in a poisoned graph prevents them from modeling the distribution of adversarial edges. Therefore, exploring alternative supervision for learning to reduce the negative effects of adversarial edges is promising.

There usually exist clean graphs whose topological distributions and attribute features are similar to those of the poisoned graph. For example, Yelp and Foursquare have similar co-review networks, where nodes are restaurants and two restaurants are linked if the number of co-reviewers exceeds a threshold; Facebook and Twitter can be treated as social networks sharing similar domains. It is not difficult to acquire graphs similar to the targeted perturbed one. As shown in existing work [19, 29], because of this similarity of topological and attribute features, knowledge can be transferred from source graphs to target ones so that performance on the target graphs is elevated. Similarly, we can inject adversarial edges into clean graphs as supervision for training robust GNNs that are able to penalize adversarial edges, and this ability can be further transferred to improve the robustness of GNNs on the poisoned graph.
Leveraging clean graphs to build robust GNNs is a promising direction; however, prior studies in this direction are rather limited. Therefore, in this paper, we investigate a novel problem of exploring clean graphs to improve the robustness of GNNs against poisoning attacks. The basic idea is to first learn to discriminate adversarial edges, thereby reducing their negative effects, and then transfer this ability to a GNN on the poisoned graph. In essence, we face two challenges: (i) how to mathematically utilize clean graphs to equip GNNs with the ability to reduce the negative impact of adversarial edges; and (ii) how to effectively transfer this ability, learned on clean graphs, to improve the robustness of GNNs on a poisoned graph. To solve these challenges, we propose a novel framework, Penalized Aggregation GNN (PA-GNN). First, clean graphs are attacked by adding adversarial edges, which serve as supervision in the form of known perturbations. With these known adversarial edges, a penalized aggregation mechanism is designed to learn to alleviate the negative influence of perturbations. We further transfer this ability to the target poisoned graph with a special meta-optimization approach, so that the robustness of GNNs is elevated. To the best of our knowledge, we are the first to propose a GNN that can directly penalize perturbations and to leverage transfer learning to enhance the robustness of GNN models. The main contributions of this paper are:

• We study a new problem and propose a principled approach of exploring clean graphs for learning a robust GNN against poisoning attacks on a target graph;
• We provide a novel framework, PA-GNN, which alleviates the negative effects of adversarial edges with a carefully designed penalized aggregation mechanism and transfers this alleviation ability to the target poisoned graph with meta-optimization;
• We conduct extensive experiments on real-world datasets to demonstrate the effectiveness of PA-GNN against various poisoning attacks and to understand its behavior.

The rest of the paper is organized as follows. We review related work in Section 2 and define our problem in Section 3. We introduce the details of PA-GNN in Section 4. Extensive experiments and their results are presented and analyzed in Section 5. We conclude the paper in Section 6.

2 RELATED WORK

In this section, we briefly review related work on graph neural networks and on adversarial attack and defense on graphs.

2.1 Graph Neural Networks

In general, graph neural networks refer to all deep learning methods for graph data [36]. They can be broadly divided into two categories, i.e., spectral-based and spatial-based. Spectral-based GNNs define "convolution" following spectral graph theory [3]. The first generation of GCNs was developed by Bruna et al. [3] using spectral graph theory, and various spectral-based GCNs were developed later [8, 14, 18, 20]. Spatial-based GNNs were proposed to improve efficiency [11, 13, 23, 25]: because they directly aggregate neighbor nodes as the convolution and are trained on mini-batches, they are more scalable than spectral-based ones. Recently, Veličković et al. [32] proposed the graph attention network (GAT), which leverages self-attention over neighbor nodes for the aggregation process. The major idea of GATs [38] is to focus on the most important neighbors and assign them higher weights during information passing.
However, existing GNNs aggregate neighbors' information for representation learning, which makes them vulnerable to adversarial attacks, especially perturbed edges added to the graph topology. Next, we review adversarial attack and defense methods on graphs.

2.2 Adversarial Attack and Defense on Graphs

Neural networks are widely criticized for their lack of robustness [5, 6, 12, 21], and the same holds for GNNs. Various adversarial attack methods have been designed, demonstrating the vulnerability of GNNs [2, 4, 7]. There are two major categories of adversarial attacks, namely evasion attacks and poisoning attacks. Evasion attacks focus on generating fake samples for a trained model; Dai et al. [7] introduce an evasion attack algorithm based on reinforcement learning. Poisoning attacks, by contrast, change the training data, which can decrease the performance of GNNs significantly. For example, Zügner et al. [41] propose nettack, which makes GNNs fail on any selected node by modifying its neighbor connections; they further develop metattack [42], which reduces the overall performance of GNNs. Compared with evasion attacks, poisoning attacks are usually stronger and can lead to extremely low performance [39, 41], because they corrupt the training data. Moreover, it is almost impossible to clean up a graph that has already been poisoned. Therefore, we focus on defending against poisoning attacks on graph data in this paper.

How to improve the robustness of GNNs against adversarial poisoning attacks is attracting increasing attention, and initial efforts have been made [17, 35, 37, 39]. For example, Wu et al. [35] utilize the Jaccard similarity of features to prune perturbed graphs, under the assumption that connected nodes should have high feature similarity. RGCN [39] adopts Gaussian distributions as node representations in each convolutional layer so that the effects of adversarial changes are absorbed in the variances of the distributions. The basic idea of these robust GNNs is to alleviate the negative effects of perturbed edges; however, they treat perturbed edges the same as normal edges during aggregation.

The proposed PA-GNN is inherently different from existing work: (i) instead of being trained purely on the poisoned target graph, it adopts clean graphs from similar domains to learn the ability to alleviate the negative effects of adversarial edges; and (ii) it leverages meta-learning to transfer this ability to the target poisoned graph to improve robustness.

3 PRELIMINARIES

3.1 Notations

We use $G = (V, E, X)$ to denote a graph, where $V = \{v_1, \dots, v_N\}$ is the set of $N$ nodes, $E \subseteq V \times V$ represents the set of edges, and $X = \{x_1, \dots, x_N\}$ indicates node features. In a semi-supervised setting, some nodes come with labels; we denote this labeled set by $V^L$ and the label of node $v$ by $y_v$. Note that the topology of $G$ is damaged, and the original clean version is unknown. In addition to the poisoned graph $G$, we assume there exist $M$ clean graphs sharing similar domains with $G$. For example, when $G$ is a citation network of publications in the data mining field, a similar graph can be another citation network from physics. We use $\{G_1, \dots, G_M\}$ to represent the clean graphs.
Similarly, each clean graph consists of nodes and edges, and we use $V^L_i$ to denote the labeled nodes of graph $G_i$.

3.2 Basic GNN Design

We now introduce the general architecture of a graph neural network. A graph neural network contains multiple layers, each of which transforms its input node features into another Euclidean space as output. Unlike a fully-connected layer, a GNN layer takes first-order neighbors' information into consideration when transforming the feature vector of a node. This "message-passing" mechanism ensures that the initial features of any two nodes can affect each other as the network goes deeper, even when the nodes are far apart. The input node features to the $l$-th layer of an $L$-layer GNN can be represented by a set of vectors $H^l = \{h^l_1, \dots, h^l_N\}$, $h^l_i \in \mathbb{R}^{d_l}$, where $h^l_i$ corresponds to node $v_i$; obviously, $H^1 = X$. The output node features of the $l$-th layer, which also form the input to the next layer, are generated as follows:

$$h^{l+1}_i = \mathrm{Update}\big(h^l_i,\ \mathrm{Agg}(h^l_j \mid j \in N_i)\big), \tag{1}$$

where $N_i$ is the set of first-order neighbors of node $v_i$, $\mathrm{Agg}(\cdot)$ is a generic aggregation function over neighbor nodes, and $\mathrm{Update}(\cdot)$ is an update function that generates a new node representation from the previous one and the messages from neighbors. Most graph neural networks follow this definition; for example, Hamilton et al. [13] introduce mean, pooling, and LSTM aggregation functions, and Veličković et al. [32] leverage a self-attention mechanism to update node representations. A GNN can be represented by a parameterized function $f_\theta$, where $\theta$ denotes its parameters, and its loss function can be represented as $\mathcal{L}_c(\theta)$. In semi-supervised node classification, the cross-entropy loss takes the form

$$\mathcal{L}_c(\theta) = -\sum_{v \in V^L} y_v \log \hat{y}_v, \tag{2}$$

where $\hat{y}_v$ is the predicted label generated by passing the output of the final GNN layer through a softmax function. A minimal sketch of this generic layer is given below.
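To ground Equation (1), here is a minimal PyTorch sketch of one generic message-passing layer. The mean aggregator and the linear update are illustrative choices only, and the names (`MeanAggLayer`, `adj`) are ours rather than the authors':

```python
import torch
import torch.nn as nn

class MeanAggLayer(nn.Module):
    """One generic GNN layer following Eq. (1):
    h_i^{l+1} = Update(h_i^l, Agg({h_j^l : j in N_i})).
    Mean aggregation and a linear Update are illustrative choices."""

    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        # Update(.) combines a node's own features with the aggregated message.
        self.update = nn.Linear(2 * in_dim, out_dim)

    def forward(self, h: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # h: (N, d_l) node features; adj: (N, N) binary adjacency matrix.
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1)  # node degrees
        msg = adj @ h / deg                              # Agg(.): mean over first-order neighbors
        return torch.relu(self.update(torch.cat([h, msg], dim=1)))
```

Stacking $L$ such layers realizes an $L$-layer GNN in the sense of Section 3.2; the attention-based layer of Section 4.1 replaces the uniform mean with learned, per-edge weights.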
3.3 Problem Definition

With the aforementioned notations and definitions, the problem of exploring clean graphs for learning a robust GNN against poisoning attacks on a target graph is formally defined as follows:

Given the target graph $G$ that is poisoned with adversarial edges, a set of clean graphs $\{G_1, \dots, G_M\}$ from domains similar to $G$, and the partially labeled nodes of each graph (i.e., $\{V^L_1, \dots, V^L_M; V^L\}$), we aim to learn a robust GNN to predict the unlabeled nodes of $G$.

It is worth noting that, in this paper, we learn a robust GNN for semi-supervised node classification; the proposed PA-GNN is a general framework applicable to various graph mining tasks such as link prediction.

4 PROPOSED FRAMEWORK

[Figure 1: Overall framework of PA-GNN. Thicker arrows indicate higher attention coefficients; $\theta^*$ denotes the model initialization from meta-optimization.]

In this section, we give the details of PA-GNN. An illustration of the framework is shown in Figure 1. First, the clean graphs $\{G_1, \dots, G_M\}$ are used to generate perturbed edges. The generated perturbations then serve as supervised knowledge to train a model initialization for PA-GNN using meta-optimization. Finally, we fine-tune this initialization on the target poisoned graph for the best performance. Thanks to the meta-optimization, the ability to reduce the negative effects of adversarial attacks is retained after adapting to $G$. In the following subsections, we introduce the technical details of PA-GNN.

4.1 Penalized Aggregation Mechanism

We begin by analyzing why GNNs are vulnerable to adversarial attacks, using the general definition of GNNs in Equation (1). Suppose the graph fed into a GNN is perturbed: the aggregation function $\mathrm{Agg}(\cdot)$ treats "fake" neighbors the same as normal ones and propagates their information when updating other nodes. As a result, GNNs fail to generate the desired outputs under adversarial attacks. Conversely, if messages passing through perturbed edges are filtered out, the aggregation function can focus on "true" neighbors; ideally, a GNN works well if all perturbed edges produced by attackers are ignored.

Motivated by this analysis, we design a novel GNN with a penalized aggregation mechanism (PA-GNN) that automatically restricts message-passing through perturbed edges. First, adopting a similar implementation to [31], we define the self-attention coefficient $a^l_{ij}$ for the features of nodes $v_i$ and $v_j$ at the $l$-th layer using a non-linear function:

$$a^l_{ij} = \mathrm{LeakyReLU}\big((\mathbf{a}^l)^\top [W^l h^l_i \oplus W^l h^l_j]\big), \tag{3}$$

where $\mathbf{a}^l$ and $W^l$ are parameters, $\top$ denotes transposition, and $\oplus$ indicates vector concatenation. Note that coefficients are defined only for first-order neighbors: taking $v_i$ as an example, we compute $a^l_{ij}$ only for $j \in N_i$, the set of direct neighbors of $v_i$. The attention coefficients related to $v_i$ are further normalized over all nodes in $N_i$ to make the scores comparable:

$$\alpha^l_{ij} = \frac{\exp\big(a^l_{ij}\big)}{\sum_{k \in N_i} \exp\big(a^l_{ik}\big)}. \tag{4}$$

The normalized attention coefficients are used to form a linear combination of the corresponding node features. This linear combination serves as the aggregation step, and its result updates the node features. More concretely, a PA-GNN layer is constructed as follows:

$$h^{l+1}_i = \sigma\Big(\sum_{j \in N_i} \alpha^l_{ij} W^l h^l_j\Big). \tag{5}$$

A similar definition can be found in [32]. Clearly, this design by itself cannot discriminate perturbed edges, let alone alleviate their negative effects on the message-passing mechanism, because there is no supervision teaching it to honor normal edges and punish perturbed ones. A natural solution is to reduce the attention coefficients of all perturbed edges in a poisoned graph: given the exponential rectifier in Equation (4), a lower attention coefficient allows only little information to pass through the corresponding edge, which mitigates its negative effects if the edge is adversarial. Moreover, since the normalized attention coefficients of a node always sum to 1, reducing the attention on perturbed edges also shifts attention toward clean neighbors. To measure the attention coefficients received by perturbed edges, we propose the following metric:

$$S_p = \sum_{l=1}^{L} \sum_{e_{ij} \in P} a^l_{ij}, \tag{6}$$

where $L$ is the total number of layers in the network and $P$ denotes the set of perturbed edges. A smaller $S_p$ indicates that adversarial edges receive less attention. The sketch below illustrates Equations (3)-(6).
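The following sketch shows one way Equations (3)-(5) could be realized for a single layer, returning the raw coefficients $a^l_{ij}$ so that $S_p$ from Equation (6) can be accumulated across layers. The edge-index tensor layout and all names are our assumptions, not the released PA-GNN code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PenalizedAttentionLayer(nn.Module):
    """One attention layer in the spirit of Eqs. (3)-(5), exposing the raw
    coefficients a_ij so that S_p (Eq. 6) can be accumulated across layers."""

    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)
        self.a = nn.Parameter(torch.empty(2 * out_dim))  # attention vector a^l
        nn.init.normal_(self.a, std=0.1)

    def forward(self, h, edge_index):
        # edge_index: (2, E) directed edges (message j -> i), first-order neighbors only.
        z = self.W(h)
        src, dst = edge_index
        # Eq. (3): a_ij = LeakyReLU(a^T [W h_i (+) W h_j]).
        a_ij = F.leaky_relu((torch.cat([z[dst], z[src]], dim=1) * self.a).sum(dim=1))
        # Eq. (4): softmax over each node's incoming edges.
        num = torch.exp(a_ij)
        denom = torch.zeros(h.size(0), device=h.device).index_add_(0, dst, num)
        alpha = num / denom[dst]
        # Eq. (5): aggregate neighbors with the normalized attention weights.
        out = torch.zeros_like(z).index_add_(0, dst, alpha.unsqueeze(1) * z[src])
        return torch.relu(out), a_ij

def attention_on_perturbed(a_ij_per_layer, perturbed_mask):
    # Eq. (6): S_p sums the raw coefficients over perturbed edges and all layers.
    return sum(a[perturbed_mask].sum() for a in a_ij_per_layer)
```

Exposing the raw (pre-softmax) coefficients matters here: Equations (6)-(9) are defined on $a^l_{ij}$, not on the normalized $\alpha^l_{ij}$.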
To further train the GNN so that a low $S_p$ is achieved, we design the following loss function to penalize perturbed edges:

$$\mathcal{L}_{dist} = -\min\Big(\eta,\ \mathbb{E}_{\substack{e_{ij} \in E \setminus P \\ 1 \le l \le L}} a^l_{ij} \; - \; \mathbb{E}_{\substack{e_{ij} \in P \\ 1 \le l \le L}} a^l_{ij}\Big), \tag{7}$$

where $\eta$ is a hyper-parameter controlling the margin between the mean values of the two distributions, $E \setminus P$ represents the normal edges in the graph, and $\mathbb{E}$ computes the expectation. Using the expected attention coefficient of normal edges as an anchor, $\mathcal{L}_{dist}$ reduces the average attention coefficient of perturbed edges until a discrepancy of $\eta$ between the two means is reached. Note that minimizing $S_p$ directly instead of $\mathcal{L}_{dist}$ leads to unstable attention coefficients, making PA-GNN hard to converge. The expectations are estimated by their empirical means:

$$\mathbb{E}_{\substack{e_{ij} \in E \setminus P \\ 1 \le l \le L}} a^l_{ij} = \frac{1}{L\,|E \setminus P|} \sum_{l=1}^{L} \sum_{e_{ij} \in E \setminus P} a^l_{ij}, \tag{8}$$

$$\mathbb{E}_{\substack{e_{ij} \in P \\ 1 \le l \le L}} a^l_{ij} = \frac{1}{L\,|P|} \sum_{l=1}^{L} \sum_{e_{ij} \in P} a^l_{ij}, \tag{9}$$

where $|\cdot|$ denotes the cardinality of a set. We combine $\mathcal{L}_{dist}$ with the original cross-entropy loss $\mathcal{L}_c$ to obtain the learning objective of PA-GNN:

$$\min_\theta \mathcal{L} = \min_\theta \big(\mathcal{L}_c + \lambda\, \mathcal{L}_{dist}\big), \tag{10}$$

where $\lambda$ balances the semi-supervised classification loss and the attention penalty on perturbed edges (a sketch of this combined objective is given at the end of this subsection).

Training PA-GNN directly with the above objective is non-trivial, because the exact perturbed edges $P$ cannot be distinguished from normal edges in a poisoned graph. However, it is practical to create known perturbations on clean graphs using graph adversarial attack methods. For example, metattack poisons a clean graph by adding adversarial edges that reduce the performance of GNNs, and these edges can be treated as the set $P$. Therefore, we explore clean graphs from domains similar to the poisoned graph. Specifically, as shown in Figure 1, we first inject perturbed edges into the clean graphs using adversarial attack methods, and then leverage these adversarial counterparts to train the ability to penalize perturbed edges. This ability is further transferred to the GNN on the target graph, improving its robustness. In the following subsection, we discuss in detail how we transfer the ability to penalize perturbed edges from the clean graphs to the target poisoned graph.
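Below is a minimal sketch of the margin penalty of Equation (7), its empirical means (Equations (8)-(9)), and the combined objective of Equation (10), assuming the per-layer raw coefficients from the previous sketch; the function names are hypothetical:

```python
import torch
import torch.nn.functional as F

def dist_loss(a_ij_per_layer, perturbed_mask, eta: float):
    """Eq. (7): push the mean attention on perturbed edges below the mean
    attention on normal edges by a margin of at least eta."""
    a_all = torch.cat(a_ij_per_layer)                  # coefficients from all L layers
    mask = perturbed_mask.repeat(len(a_ij_per_layer))  # same edge set at every layer
    mean_clean = a_all[~mask].mean()                   # Eq. (8)
    mean_perturbed = a_all[mask].mean()                # Eq. (9)
    # -min(eta, clean - perturbed): the loss saturates once the margin is met.
    return -torch.clamp(mean_clean - mean_perturbed, max=eta)

def pa_gnn_loss(logits, labels, labeled_idx, a_ij_per_layer, perturbed_mask,
                eta: float, lam: float):
    """Eq. (10): cross-entropy on labeled nodes plus the weighted attention penalty."""
    l_c = F.cross_entropy(logits[labeled_idx], labels[labeled_idx])  # Eq. (2)
    return l_c + lam * dist_loss(a_ij_per_layer, perturbed_mask, eta)
```

Note how `torch.clamp(..., max=eta)` realizes the $\min(\eta, \cdot)$ in Equation (7): once the margin is satisfied, the penalty term contributes a constant and gradients flow only from the classification loss.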
4.2 Transfer with Meta-Optimization

As discussed above, it is very challenging to train PA-GNN for a poisoned graph because the distribution of adversarial edges is unknown. We therefore exploit clean graphs from similar domains to create adversarial counterparts that serve as supervised knowledge. One simple way to use them is to pre-train PA-GNN on the clean graphs and their perturbations, which form the adversarial edge set $P$, and then fine-tune the pre-trained model on the target graph $G$ with only the node classification objective. However, the performance of such pre-training is rather limited, because different graphs have different data distributions, which makes it difficult to equip the GNN with a generalized ability to discriminate perturbations. Our experimental results in Section 5.3 confirm this analysis.

In recent years, meta-learning has shown promising results in various applications [24, 27, 30, 33]. The goal of meta-learning is to train a model on a variety of learning tasks such that it can solve new tasks with little or no supervision [10, 15]. Finn et al. [10] propose a model-agnostic meta-learning algorithm in which the model is trained explicitly so that a small number of gradient steps and little training data from a new task produce good generalization performance on that task. This motivates us to train a meta-model with a generalized ability to penalize perturbed edges (i.e., to assign them lower attention coefficients). The meta-model serves as the initialization of PA-GNN, and its fast-adaptation capability helps retain the penalizing ability as much as possible on the target poisoned graph. To achieve this, we propose a meta-optimization algorithm that trains the initialization of PA-GNN: with manually generated perturbations on the clean graphs, PA-GNN receives full supervision and its initialization preserves the penalizing ability, so that the model fine-tuned on the poisoned graph $G$ can defend against adversarial attacks while maintaining excellent performance.

We begin by generating perturbations on the clean graphs, choosing the state-of-the-art graph adversarial attack method metattack [42]. Let $P_i$ represent the set of adversarial edges created for clean graph $G_i$. Next, we define the learning tasks for the meta-optimization. The learning objective of every task is given by Equation (10): classify nodes accurately while assigning low attention coefficients to the perturbed edges of the corresponding graph. Let $T_i$ denote the task for $G_i$; there are thus $M$ tasks, one per clean graph. Because each task is tied to a specific graph, we use $\mathcal{L}_{T_i}(\theta)$ to denote the loss function of task $T_i$. We then compile support and query sets for the tasks: the labeled nodes of each clean graph are split into two groups, one forming the support set and the other the query set. Let $S_i$ and $Q_i$ denote the support set and query set of $G_i$, respectively.

Given the $M$ learning tasks, the optimization algorithm first adapts the initial model parameters to every task separately. Formally, $\theta$ becomes $\theta'_i$ when adapting to $T_i$. We compute the updated parameters $\theta'_i$ with gradient descent, where the gradient is evaluated using $\mathcal{L}_{T_i}(\theta)$ on the corresponding support set $S_i$:

$$\theta'_i = \theta - \alpha\, \nabla_\theta\, \mathcal{L}_{T_i}(\theta), \tag{11}$$

where $\alpha$ is the learning rate. Only one gradient step is shown in Equation (11), but using multiple gradient updates is a straightforward extension, as suggested by [10]. This yields $M$ task-specific versions of the initial model (i.e., $f_{\theta'_1}, \dots, f_{\theta'_M}$).

The model parameters are then trained by optimizing the performance of $f_{\theta'_i}$ with respect to $\theta$ across all tasks. More concretely, the meta-optimization objective is:

$$\min_\theta \sum_{i=1}^{M} \mathcal{L}_{T_i}(\theta'_i) = \min_\theta \sum_{i=1}^{M} \mathcal{L}_{T_i}\big(\theta - \alpha\, \nabla_\theta\, \mathcal{L}_{T_i}(\theta)\big). \tag{12}$$

Because the objective of PA-GNN considers both classifying nodes and penalizing adversarial edges, the model parameters preserve the ability to reduce the negative effects of adversarial attacks while maintaining high classification accuracy. Note that the meta-optimization is performed over $\theta$, whereas the objective is computed using the updated parameters $\theta'_i$ of all tasks; a sketch of this inner/outer loop is given below.
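The following sketch mirrors this inner/outer structure. For brevity it uses a first-order MAML-style variant (the paper builds on [10], and Equation (12) as written implies differentiating through the inner update; we simplify that here). The `task` interface with `support`/`query` splits and an Eq. (10) `loss` is an assumption:

```python
import torch

def meta_optimize(model, tasks, alpha=0.01, beta=0.001, steps=1000):
    """First-order MAML-style training of the PA-GNN initialization.
    Inner step follows Eq. (11); the outer update applies Eq. (13)."""
    meta_opt = torch.optim.SGD(model.parameters(), lr=beta)
    for _ in range(steps):
        meta_opt.zero_grad()
        theta = [p.detach().clone() for p in model.parameters()]  # snapshot of theta
        for task in tasks:  # one task T_i per clean graph G_i
            # Inner step, Eq. (11): adapt theta to theta'_i on the support set S_i.
            loss_s = task.loss(model, task.support)
            grads = torch.autograd.grad(loss_s, list(model.parameters()))
            with torch.no_grad():
                for p, g in zip(model.parameters(), grads):
                    p -= alpha * g
            # Outer objective, Eq. (12): evaluate theta'_i on the query set Q_i;
            # backward() accumulates (first-order) meta-gradients into p.grad.
            task.loss(model, task.query).backward()
            with torch.no_grad():  # restore theta before the next task
                for p, t in zip(model.parameters(), theta):
                    p.copy_(t)
        meta_opt.step()  # outer update across all tasks (Eq. 13 below)
    return model
```

Restoring $\theta$ after each task keeps the adaptations in Equation (11) independent across tasks, matching the per-task loop of Algorithm 1 below, while the accumulated query-set gradients implement the summed outer update.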
Consequently, the model parameters are optimized such that a few gradient steps on a new task produce maximally effective behavior on that task. This fast-adaptation property helps the model retain its ability to penalize perturbed edges on $G$, which is confirmed by the experimental results in Section 5.3.1. Formally, stochastic gradient descent (SGD) is used to update the model parameters $\theta$ across tasks:

$$\theta \leftarrow \theta - \beta\, \nabla_\theta \sum_{i=1}^{M} \mathcal{L}_{T_i}(\theta'_i). \tag{13}$$

In practice, the above gradients are estimated using the labeled nodes of the query sets $Q_i$ of all tasks. Our empirical results suggest that splitting support and query sets on-the-fly across the iterations of meta-optimization improves overall performance, and we adopt this strategy in the training procedure of PA-GNN.

Training Algorithm. An overview of the training procedure of PA-GNN is given in Algorithm 1. Adversarial edges are injected into the clean graphs using metattack (lines 2 to 4). Support and query sets are randomly split on-the-fly for each task (line 7). We then adapt the initial model parameters $\theta$ to $\theta'_i$ for each task $T_i$. The model parameters are updated by optimizing the performance of all $f_{\theta'_i}$ with respect to $\theta$ using the labeled nodes of the query sets (line 11). Finally, we adapt the trained model initialization to the target poisoned graph by minimizing the classification loss $\mathcal{L}_c$ on $G$.

Algorithm 1: The training framework of PA-GNN
Input: $G$ and $\{G_1, \dots, G_M\}$
Output: Model parameters $\theta$
1  Randomly initialize $\theta$;
2  for $G_i = G_1, \dots, G_M$ do
3      Select perturbed edge set $P_i$ with metattack;
4  end
5  while not early-stop do
6      for $G_i = G_1, \dots, G_M$ do
7          Split the labeled nodes of $G_i$ into support set $S_i$ and query set $Q_i$;
8          Evaluate $\nabla_\theta \mathcal{L}_{T_i}(\theta)$ with $S_i$ and $\mathcal{L}_{T_i}$;
9          Compute adapted parameters with gradient descent: $\theta'_i \leftarrow \theta - \alpha \nabla_\theta \mathcal{L}_{T_i}(\theta)$;
10     end
11     Update $\theta$ on $\{Q_1, \dots, Q_M\}$ with $\theta \leftarrow \theta - \beta \nabla_\theta \sum_{i=1}^{M} \mathcal{L}_{T_i}(\theta'_i)$;
12 end
13 Fine-tune $\theta$ on $G$ using $\mathcal{L}_c$;

5 EXPERIMENTS

In this section, we conduct experiments to evaluate the effectiveness of PA-GNN. We aim to answer the following questions:
• Can PA-GNN outperform existing robust GNNs under representative and state-of-the-art adversarial attacks on graphs?
• How do the penalized aggregation mechanism and the meta-optimization algorithm contribute to PA-GNN?
• How sensitive is PA-GNN to its hyper-parameters?
Next, we introduce the experimental settings, followed by experiments on node classification that answer these questions.

5.1 Experimental Setup

5.1.1 Datasets. To conduct comprehensive studies of PA-GNN, we run experiments under two different settings:
• Same-domain setting: We sample the poisoned graph and clean graphs from the same data distribution. Two popular benchmark networks (i.e., Pubmed [28] and Reddit [13]) are selected as large graphs. Pubmed is a citation network where nodes are documents and edges represent citations; Reddit is compiled …