GcExcel导出OFD方案（三）

Richard.Huang 发表于 2024-5-27 15:47:02

本帖最后由 Richard.Huang 于 2024-5-27 16:08 编辑

背景
经过前面两篇枯燥的依赖准备以及底层原理讲解，相比大家都有些跃跃欲试了，这篇文章，是咱们的完结篇，详细分享了如何在代码上来实现这样一个将大象装入冰箱的效果。

步骤
大致流程

1. 继承PDFGraphicsStreamEngine类，便于分析PDF数据图层和资源归类
public class OFDPageDrawer extends PDFGraphicsStreamEngine {
}2. 重写构造方法，分析pdf每页的资源，并初始化OFD生成器
/**
* 构造器，调用super(page),这个操作的目的是将page资源准备好，并且添加对应的操作符，当下一次调用processPage或者processPageContentStream时执行对应的操作符对应的操作
* @param idx
* @param page
* @param ofdCreator
* @param scale
* @throws IOException
*/
protected OFDPageDrawer(int idx, PDPage page, OFDCreator ofdCreator, float scale) throws IOException {
super(page);
this.page = page;
this.ofdCreator = ofdCreator;
ctLayer = this.ofdCreator.createLayer();
this.scale = scale;
}3. 重写drawImage方法收集整理pdf中分析出来的图片资源
/**
* 作用：将 PDF 图像对象转换为 OFD 格式进行绘制。此方法包括：
*
* 将图像写入字节流并保存。
* 根据当前变换矩阵计算图像在页面上的位置和大小。
* 创建 OFD 图像对象并设置其相关属性，然后添加到当前层中。
*
* @param pdImage
* @throws IOException
*/
@Override
public void drawImage(PDImage pdImage) throws IOException {
ByteArrayOutputStream bosImage = new ByteArrayOutputStream();
String suffix = "png";
ImageIO.write(pdImage.getImage(), suffix, bosImage);
String name = String.format("%s.%s", bcMD5(bosImage.toByteArray()), suffix);
ofdCreator.putImage(name, bosImage.toByteArray(), suffix);

// 根据当前变换矩阵计算图像在页面上的位置和大小，实际上就是将PDF中该图像的属性信息转换成OFD中的形式
Matrix ctmNew = this.getGraphicsState().getCurrentTransformationMatrix();
float imageXScale = ctmNew.getScalingFactorX();
float imageYScale = ctmNew.getScalingFactorY();
double x = ctmNew.getTranslateX() * scale;
double y = (page.getCropBox().getHeight() - ctmNew.getTranslateY() - imageYScale) * scale;
double w = imageXScale * scale;
double h = imageYScale * scale;
ImageObject imageObject = new ImageObject(ofdCreator.getNextRid());
imageObject.setBoundary(x, y, w, h);
imageObject.setResourceID(new ST_RefID(ST_ID.getInstance(ofdCreator.getImageMap().get(name))));
imageObject.setCTM(ST_Array.getInstance(String.format("%.0f 0 0 %.0f 0 0", w, h)));
setImageClip(imageObject, x, y, w, h);
ctLayer.add(imageObject);
}4. 通过继承PDFGraphicsStreamEngine类分析得到的文字内容重绘
public void addPageContent(int idx, CT_Layer ctLayer, float width, float height) {
PageDir pageDirInv = new PageDir();// 资源归类
pageDirInv.setIndex(idx);
org.ofdrw.core.basicStructure.pageObj.Page pageInv = new org.ofdrw.core.basicStructure.pageObj.Page();
CT_PageArea areaInv = new CT_PageArea();// ofd可视区域（pdf的裁剪区）
areaInv.setPhysicalBox(0, 0, width, height);
pageInv.setArea(areaInv);
Content contentInv = new Content();// 内容
contentInv.addLayer(ctLayer);
pageInv.setContent(contentInv);
pageDirInv.setContent(pageInv);
docDir.getPages().add(pageDirInv);
}5. 将收集到的资源进行打包生成OFD文件
/**
* 打包OFD文件包二进制数据
*
* @param virtualFileMap
* @return
* @throws IOException
*/
public static void zip(Map<String, byte[]> virtualFileMap,OutputStream output) throws IOException {
ZipArchiveOutputStream zaos = new ZipArchiveOutputStream(output);

for (Map.Entry<String, byte[]> entry : virtualFileMap.entrySet()) {
   zaos.putArchiveEntry(new ZipArchiveEntry(entry.getKey()));
   zaos.write(entry.getValue());
   zaos.closeArchiveEntry();
}

zaos.finish();
}
完整代码

使用步骤：
1. 下载文件到本地
2. 安装maven中的依赖
3. 打开/src/main/java/GrapeCity/PDF转OFD
4. 将源文件地址和导出文件地址进行更换
5. 右键启动GcExcel2OFD方法

特别鸣谢
1. PDF-Explained：https://zxyle.github.io/PDF-Explained/chapter3.html
2. 一文搞懂PDF格式：https://cloud.tencent.com/developer/article/1575759
3. https://gitee.com/gblfy/ofd-pdf?_from=gitee_search

页: [1]

葡萄城开发者社区's Archiver

GcExcel导出OFD方案（三）