If you need to manipulate Office documents programmatically in a platform-independent way, i.e. without the Office interop assemblies, NPOI is one of the best choices available.
NPOI is the .NET port of the popular Apache POI project. It is an open source library for reading and writing Office file formats. If you have ever used Apache POI's Java implementation, NPOI will feel straightforward.
For example, to print all the text in a .docx file, you could use a function like this:
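using System;
using System.IO;
using NPOI.XWPF.UserModel;

public static void PrintAllText(string path)
{
    // Open the file read-only; the using block closes the stream when done.
    using (Stream stream = new FileStream(path, FileMode.Open, FileAccess.Read))
    {
        XWPFDocument doc = new XWPFDocument(stream);
        foreach (var element in doc.BodyElements)
        {
            // Body elements can also be tables, so skip anything that is not a paragraph.
            if (element is XWPFParagraph paragraph)
            {
                Console.WriteLine(paragraph.Text);
            }
        }
    }
}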
As another example, to read a Word document and list all the images in a .docx file, here is one such sample:
// Namespaces used: System, System.IO, NPOI.XWPF.UserModel
public static void ReadImages(string filePath)
{
    // Open the file read-only; the using block closes the stream when done.
    using (Stream stream = new FileStream(filePath, FileMode.Open, FileAccess.Read))
    {
        XWPFDocument doc = new XWPFDocument(stream);
        // AllPictures returns the pictures embedded in the document.
        foreach (XWPFPictureData data in doc.AllPictures)
        {
            Console.WriteLine("Name: " + data.FileName);
            Console.WriteLine("Details: " + data);
        }
    }
}
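Listing the pictures is often only the first step. The following is a minimal sketch, not a definitive implementation, of writing each embedded image to disk. The class name DocxImageExporter, the method SaveImages, and the outputDir parameter are hypothetical names introduced for illustration, and the sketch assumes that XWPFPictureData exposes the raw bytes through a Data property (the counterpart of getData() in Apache POI); check this against the NPOI version you use.

```csharp
using System;
using System.IO;
using NPOI.XWPF.UserModel;

public static class DocxImageExporter
{
    // Hypothetical helper: copies each embedded picture into outputDir.
    public static void SaveImages(string docxPath, string outputDir)
    {
        // Creates the folder if it does not exist yet; no-op otherwise.
        Directory.CreateDirectory(outputDir);

        using (Stream stream = new FileStream(docxPath, FileMode.Open, FileAccess.Read))
        {
            XWPFDocument doc = new XWPFDocument(stream);
            foreach (XWPFPictureData picture in doc.AllPictures)
            {
                // Assumption: Data holds the raw image bytes (mirrors POI's getData()).
                byte[] bytes = picture.Data;

                // Path.GetFileName guards against names that contain directory parts.
                string target = Path.Combine(outputDir, Path.GetFileName(picture.FileName));
                File.WriteAllBytes(target, bytes);
                Console.WriteLine("Saved " + target);
            }
        }
    }
}
```

A call such as DocxImageExporter.SaveImages("report.docx", "images") would then place the pictures next to each other in the images folder, keeping the file names stored inside the document.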
Here are some of the important links for NPOI