Learning Libxml2: Generating and Parsing XML Files

阿新 • Published: 2019-02-05

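A small project I worked on recently involved data transfer. XML is well suited to transfer over the World Wide Web, and it provides a uniform way to describe and exchange structured data independently of any application or vendor. To keep the transfers efficient and correct, we chose XML as the file format, which meant the project had to both generate and parse XML files. Afterwards I read up on libxml2, and this post summarizes what I learned. (Corrections are welcome.)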

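1. Libxml2 is an XML library written in C. It offers simple, convenient functions for the usual operations on XML files and supports XPath queries; XSLT transformations are handled by its companion library, libxslt.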

The simplest way to install it (on Debian or Ubuntu):

sudo apt-get install libxml2
sudo apt-get install libxml2-dev

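2. The main libxml2 data types

(1) The internal character type xmlChar

xmlChar is the character type used in Libxml2; every character and string in the library is based on it. It is defined in xmlstring.h as: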

typedef unsigned char xmlChar;

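unsigned char was chosen as the internal character format because it suits UTF-8, which is libxml2's internal encoding; text in any other encoding must be converted to UTF-8 before libxml2 can use it. xmlChar * is the usual string type. Many functions return a dynamically allocated xmlChar *, and the caller must release it with xmlFree when done.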

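(2) xmlChar helper functions

Just as with char in standard C, there are memory-allocation and string functions for xmlChar. For example, xmlMalloc allocates memory, xmlFree is the matching release function, and xmlStrcmp compares two strings. Nearly all xmlChar string functions are declared in xmlstring.h, while the memory-allocation functions are declared in the xmlmemory.h header.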

(3) Converting between xmlChar * and other types

Real code constantly needs to cast between char * and xmlChar *, so libxml2 defines the macro BAD_CAST for casting to xmlChar *:

#define BAD_CAST (xmlChar *)

(4) A common alias in libxml2 code

libxml2 programs often use the name xmlChildrenNode. It is simply a macro alias defined in tree.h:

#define xmlChildrenNode children

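(5) The document type xmlDoc and its pointer xmlDocPtr

xmlDoc is a struct holding information about one XML document, such as its file name, document type and child nodes; xmlDocPtr is xmlDoc *. The main functions that work with document pointers are:

xmlNewDoc creates a new document and returns its pointer.

xmlParseFile parses an XML file with default options (the encoding is detected from the file itself) and returns a document pointer.

xmlReadFile reads an XML document, optionally with an explicit encoding and parser options, and returns a document pointer.

xmlFreeDoc frees a document. Note that it also frees every node the document contains, so there is normally no need to call xmlFreeNode or xmlFreeNodeList by hand unless a node has been removed from the document. In general, allocate all nodes dynamically, add them to the document, and call xmlFreeDoc once at the end to release everything; this is why xmlFreeNode rarely appears in programs.

xmlSaveFile writes a document to a file with default settings.

xmlSaveFormatFileEnc writes a document to a file in a chosen encoding, optionally indented.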

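(6) The node type xmlNode and its pointer xmlNodePtr

The node is the most important building block in XML. xmlNode represents one node of an XML document and is implemented as a struct with many important fields. It is defined in tree.h as follows: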

typedef struct _xmlNode xmlNode;
typedef xmlNode *xmlNodePtr;
struct _xmlNode {
    void            *_private;                   /* application data */
    xmlElementType  type;                        /* type number, must be second ! */
    const xmlChar   *name;                       /* the name of the node, or the entity */
    struct _xmlNode *children;                   /* parent->childs link */
    struct _xmlNode *last;                       /* last child link */
    struct _xmlNode *parent;                     /* child->parent link */
    struct _xmlNode *next;                       /* next sibling link */
    struct _xmlNode *prev;                       /* previous sibling link */
    struct _xmlDoc  *doc;                        /* the containing document */
    xmlNs           *ns;                         /* pointer to the associated namespace */
    xmlChar         *content;                    /* the content */
    struct _xmlAttr *properties;                 /* properties list */
    xmlNs           *nsDef;                      /* namespace definitions on this node */
    void            *psvi;                       /* for type/PSVI informations */
    unsigned short  line;                        /* line number */
    unsigned short  extra;                       /* extra data for XPath/XSLT */
};

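As the struct shows, nodes are organized both as a linked list and as a tree: next and prev link siblings into a list, while parent and children form the tree. Other important members are:

content: the text content of the node (set on text nodes; an element's text lives in its child text nodes).

doc: the document the node belongs to.

name: the node's name.

ns: the node's namespace.

properties: the node's attribute list.

At bottom, working with an XML document means moving between nodes, reading their information, and adding, deleting or modifying them. xmlDocSetRootElement makes a node the root of a document, which is the key step that ties nodes to a document. Once there is a root node, child nodes can be attached beneath it one after another to build the XML tree.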

(7) XML attributes

Attributes are another structure that comes up all the time. The definition is:

struct _xmlAttr {
    void               *_private;                /* application data */
    xmlElementType     type;                     /* XML_ATTRIBUTE_NODE, must be second ! */
    const xmlChar      *name;                    /* the name of the property */
    struct _xmlNode    *children;                /* the value of the property */
    struct _xmlNode    *last;                    /* NULL */
    struct _xmlNode    *parent;                  /* child->parent link */
    struct _xmlAttr    *next;                    /* next sibling link */
    struct _xmlAttr    *prev;                    /* previous sibling link */
    struct _xmlDoc     *doc;                     /* the containing document */
    xmlNs              *ns;                      /* pointer to the associated namespace */
    xmlAttributeType   atype;                    /* the attribute type if validating */
    void               *psvi;                    /* for type/PSVI informations */
};

(8) The node-set type xmlNodeSet and its pointer xmlNodeSetPtr
A node set is a variable made up of nodes. Node sets only appear as the result of an XPath query, so the type is defined in xpath.h:

typedef struct _xmlNodeSet xmlNodeSet;
typedef xmlNodeSet *xmlNodeSetPtr;

struct _xmlNodeSet {
    int nodeNr;                /* number of nodes in the set */
    int nodeMax;               /* size of the array as allocated */
    xmlNodePtr *nodeTab;       /* array of nodes in no particular order */
};

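The node set has three members: the number of nodes in the set, the number of nodes the array has room for, and a pointer to the start of the node array. The nodes are accessed like this:
xmlNodeSetPtr nodeset = result->nodesetval;   /* result: the xmlXPathObjectPtr returned by an XPath query */

for (int i = 0; i < nodeset->nodeNr; i++)
{
    xmlNodePtr node = nodeset->nodeTab[i];
    /* work with node here */
}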


3. Example: generating an XML file

/*
 * =====================================================================================
 *
 *       Filename:  2.c
 *
 *    Description:  Create an XML file
 *
 *        Version:  1.0
 *        Created:  2014-05-29 19:04:37
 *       Revision:  none
 *       Compiler:  gcc
 *
 *         Author:  leiyu, [email protected]
 *        Company:  Class 1107 of Computer Science and Technology
 *
 * =====================================================================================
 */

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <libxml/xmlmemory.h>
#include <libxml/parser.h>

int main(void)
{
    /* Create the document and attach a root element to it. */
    xmlDocPtr doc = xmlNewDoc(BAD_CAST "1.0");
    xmlNodePtr root_node = xmlNewNode(NULL, BAD_CAST "root");
    xmlDocSetRootElement(doc, root_node);

    /* Add three child elements that carry text content. */
    xmlNewTextChild(root_node, NULL, BAD_CAST "newNode1", BAD_CAST "newNode1 content");
    xmlNewTextChild(root_node, NULL, BAD_CAST "newNode2", BAD_CAST "newNode2 content");
    xmlNewTextChild(root_node, NULL, BAD_CAST "newNode3", BAD_CAST "newNode3 content");

    /* Build an element and its text node separately, link them, then add an attribute. */
    xmlNodePtr node = xmlNewNode(NULL, BAD_CAST "node2");
    xmlNodePtr content = xmlNewText(BAD_CAST "NODE CONTENT");
    xmlAddChild(root_node, node);
    xmlAddChild(node, content);
    xmlNewProp(node, BAD_CAST "attribute", BAD_CAST "yes");

    /* Build a nested son/grandson pair. */
    node = xmlNewNode(NULL, BAD_CAST "son");
    xmlAddChild(root_node, node);
    xmlNodePtr grandson = xmlNewNode(NULL, BAD_CAST "grandson");
    xmlAddChild(node, grandson);
    xmlAddChild(grandson, xmlNewText(BAD_CAST "This is a grandson node"));

    /* Save the document; xmlSaveFile returns -1 on failure. */
    int nRel = xmlSaveFile("CreatedXml.xml", doc);

    /* Freeing the document also frees every node attached to it. */
    xmlFreeDoc(doc);

    if (nRel == -1)
    {
        fprintf(stderr, "failed to save CreatedXml.xml\n");
        return 1;
    }
    return 0;
}

The output is:

<?xml version="1.0"?>
<root><newNode1>newNode1 content</newNode1><newNode2>newNode2 content</newNode2><newNode3>newNode3 content</newNode3><node2 attribute="yes">NODE CONTENT</node2><son><grandson>This is a grandson node</grandson></son></root>

4. Example: parsing an XML file

The XML file to parse:

<story>
	<storyinfo>
		<author>John Fleck</author>
		<datewritten>June 2, 2002</datewritten>
		<keyword>example keyword</keyword>
	</storyinfo>
	<body>
		<headline>This is the headline</headline>
		<para>This is the body text.</para>
	</body>
</story>

The program:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <libxml/xmlmemory.h>
#include <libxml/parser.h>

/* Print the <author> and <keyword> children of a <storyinfo> element. */
static void parseStory(xmlDocPtr doc, xmlNodePtr cur)
{
	xmlChar *key, *name;

	cur = cur->xmlChildrenNode;
	while (cur != NULL)
	{
		if (!xmlStrcmp(cur->name, (const xmlChar *)"author"))
		{
			/* xmlNodeListGetString returns a copy that must be released with xmlFree. */
			name = xmlNodeListGetString(doc, cur->xmlChildrenNode, 1);
			printf("author: %s\n", name ? (const char *)name : "");
			xmlFree(name);
		}
		if (!xmlStrcmp(cur->name, (const xmlChar *)"keyword"))
		{
			key = xmlNodeListGetString(doc, cur->xmlChildrenNode, 1);
			printf("keyword: %s\n", key ? (const char *)key : "");
			xmlFree(key);
		}
		cur = cur->next;
	}
}

/* Parse the file, check that the root is <story>, and handle each <storyinfo>. */
static void parseDoc(const char *docname)
{
	xmlDocPtr doc;
	xmlNodePtr cur;

	doc = xmlParseFile(docname);
	if (doc == NULL)
	{
		fprintf(stderr, "Document not parsed successfully.\n");
		return;
	}
	cur = xmlDocGetRootElement(doc);
	if (cur == NULL)
	{
		fprintf(stderr, "empty document\n");
		xmlFreeDoc(doc);
		return;
	}
	if (xmlStrcmp(cur->name, (const xmlChar *)"story"))
	{
		fprintf(stderr, "document of the wrong type, root node != story\n");
		xmlFreeDoc(doc);
		return;
	}
	cur = cur->xmlChildrenNode;
	while (cur != NULL)
	{
		if (!xmlStrcmp(cur->name, (const xmlChar *)"storyinfo"))
		{
			parseStory(doc, cur);
		}
		cur = cur->next;
	}
	xmlFreeDoc(doc);
}

int main(void)
{
	const char *docname = "1.xml";

	parseDoc(docname);
	return 0;
}
