SizectorS_SDK_DigitRec
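Overview
This project builds a character-recognition application on top of SizectorS_SDK. It obtains the point cloud by reading an offline mpdat data file, and it runs on Windows.
Workflow
Load the offline data
The Load function in MPSizectorS_Utils loads the data, and ConvertToFloat3DFrame converts it into a floating-point matrix of type MPSizectorS_DataFrameFloat3DStruct for subsequent processing.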
MPSizectorS_DataFrameUndefinedStruct data;
bool dev = MPSizectorS_Utils::Load(&data, "E:/SizectorS_SDK_DigitRec/test/SizectorS_DataExport4.mpdat");
MPSizectorS_DataFrameFloat3DStruct float3DData = MPSizectorS_Utils::ConvertToFloat3DFrame(data);
Get image information
This step retrieves the image dimensions and the depth range. Calling MPSizectorS_Utils::GetAutoColorDepthRange_Float3D() fills an MPSizectorS_ColorDepthRangeStruct that later steps rely on.
MPSizectorS_ColorDepthRangeStruct colorDepth;
int xPixResolution = data.FrameInfo.DataInfo.XPixResolution;
int yPixResolution = data.FrameInfo.DataInfo.YPixResolution;
MPSizectorS_Utils::GetAutoColorDepthRange_Float3D(&colorDepth, data.FrameInfo, float3DData.Data);
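Display the images
GetGrayBitmap converts the frame data into a grayscale image, which is wrapped in a CV_8UC1 Mat object named inputImg for display. Likewise, GetColorDepthBitmap converts the frame data into a color depth image, which is assembled into a CV_8UC3 Mat object named inputRGBImg for display.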
unsigned char* bitmapBuffer = new unsigned char[xPixResolution * yPixResolution];
unsigned char* redBuffer = new unsigned char[xPixResolution * yPixResolution];
unsigned char* greenBuffer = new unsigned char[xPixResolution * yPixResolution];
unsigned char* blueBuffer = new unsigned char[xPixResolution * yPixResolution];
MPSizectorS_Utils::GetGrayBitmap(&data, bitmapBuffer);
cv::Mat inputImg(yPixResolution, xPixResolution, CV_8UC1, bitmapBuffer);
MPSizectorS_Utils::GetColorDepthBitmap(&data, redBuffer, greenBuffer, blueBuffer, colorDepth.Max + 0.1, colorDepth.Min - 0.1);
// Build a color image from the three channel arrays (OpenCV stores pixels in B, G, R order)
cv::Mat inputRGBImg(yPixResolution, xPixResolution, CV_8UC3);
for (int i = 0; i < yPixResolution; ++i) {
    for (int j = 0; j < xPixResolution; ++j) {
        int index = i * xPixResolution + j;
        inputRGBImg.at<cv::Vec3b>(i, j) = cv::Vec3b(
            blueBuffer[index], greenBuffer[index], redBuffer[index]);
    }
}
cv::namedWindow("inputImg", cv::WINDOW_NORMAL);
cv::imshow("inputImg", inputImg);
cv::resizeWindow("inputImg", 800, 600);
cv::namedWindow("inputRGBImg", cv::WINDOW_NORMAL);
cv::imshow("inputRGBImg", inputRGBImg);
cv::resizeWindow("inputRGBImg", 800, 600);
cv::waitKey();
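The per-pixel loop above combines three planar channel buffers into one interleaved image. A minimal standalone sketch of that interleaving step, without OpenCV, might look like the following. The function name interleaveBGR and the use of std::vector are illustrative assumptions and not part of the SDK or of this project.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: interleave three planar 8-bit channels into one
// B,G,R,B,G,R,... buffer. All planes are assumed to hold the same number
// of samples (width * height) in row-major order.
std::vector<std::uint8_t> interleaveBGR(const std::vector<std::uint8_t>& red,
                                        const std::vector<std::uint8_t>& green,
                                        const std::vector<std::uint8_t>& blue)
{
    assert(red.size() == green.size() && green.size() == blue.size());
    std::vector<std::uint8_t> bgr(red.size() * 3);
    for (std::size_t i = 0; i < red.size(); ++i) {
        bgr[3 * i + 0] = blue[i];   // OpenCV's default channel order is B, G, R
        bgr[3 * i + 1] = green[i];
        bgr[3 * i + 2] = red[i];
    }
    return bgr;
}
```

A buffer laid out this way can be wrapped without copying by cv::Mat(height, width, CV_8UC3, bgr.data()), provided the vector outlives the Mat.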
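Establish the reference plane
Pick three pixel coordinates on the input image, (xPoint1, yPoint1), (xPoint2, yPoint2), and (xPoint3, yPoint3), and build a plane from the three corresponding 3D points.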
unsigned int xPoint1 = 126;
unsigned int yPoint1 = 439;
unsigned int xPoint2 = 117;
unsigned int yPoint2 = 579;
unsigned int xPoint3 = 818;
unsigned int yPoint3 = 514;
MPSizectorS_DataPointFloat3DStruct point1 = {};
MPSizectorS_DataPointFloat3DStruct point2 = {};
MPSizectorS_DataPointFloat3DStruct point3 = {};
traverseNineGrid(&point1, xPoint1, yPoint1, float3DData);
traverseNineGrid(&point2, xPoint2, yPoint2, float3DData);
traverseNineGrid(&point3, xPoint3, yPoint3, float3DData);
Point a = { point1.X, point1.Y, point1.Z };
Point b = { point2.X, point2.Y, point2.Z };
Point c = { point3.X, point3.Y, point3.Z };
Point normal = plane(a, b, c);
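The plane and distance_to_plane helpers are not listed in this document. As a rough sketch of the underlying geometry, the normal of a plane through three points is the cross product of two edge vectors, and the distance of a point to the plane is its projection onto the unit normal. The names Vec3, planeNormal, and signedDistance below are hypothetical and are not claimed to match the project's own helpers.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

Vec3 sub(const Vec3& u, const Vec3& v) { return {u.x - v.x, u.y - v.y, u.z - v.z}; }

double dot(const Vec3& u, const Vec3& v) { return u.x * v.x + u.y * v.y + u.z * v.z; }

Vec3 cross(const Vec3& u, const Vec3& v)
{
    return {u.y * v.z - u.z * v.y,
            u.z * v.x - u.x * v.z,
            u.x * v.y - u.y * v.x};
}

// Unit normal of the plane through a, b, c: normalize((b - a) x (c - a)).
// Returns {0, 0, 0} when the three points are collinear (no unique plane).
Vec3 planeNormal(const Vec3& a, const Vec3& b, const Vec3& c)
{
    Vec3 n = cross(sub(b, a), sub(c, a));
    double len = std::sqrt(dot(n, n));
    if (len == 0.0) return {0.0, 0.0, 0.0};
    return {n.x / len, n.y / len, n.z / len};
}

// Signed distance from p to the plane through a with unit normal n.
// The sign tells which side of the plane p lies on; use std::fabs for
// an unsigned distance.
double signedDistance(const Vec3& n, const Vec3& a, const Vec3& p)
{
    return dot(n, sub(p, a));
}
```

Whether the threshold comparison should use the signed or the absolute distance depends on which side of the reference plane the characters lie, so that choice is left open here.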
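Binarize the image
Points in the input image are visited in order, and each point's distance to the plane is taken as its brightness value. That value is compared with a preset threshold: points at or above the threshold are set to white, and all others (including pixels with no valid point) are set to black. Once every point has been processed, the binarized image is saved to disk.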
// Create an image with the same size as the display region
cv::Mat image(creatImageHeight, creatImageWidth, CV_8UC1);
// Set each pixel: white if its distance reaches the threshold, black otherwise
for (int i = 0; i < creatImageHeight; i++)
{
    for (int j = 0; j < creatImageWidth; j++)
    {
        point4 = {};
        flag = traverseNineGrid(&point4, j + xStart, i + yStart, float3DData);
        if (flag) {
            Point p = { point4.X, point4.Y, point4.Z };
            float distance = distance_to_plane(normal, p, a, b, c);
            if (distance >= binThreshold) {
                image.at<uchar>(i, j) = 255;
                continue;
            }
        }
        image.at<uchar>(i, j) = 0;
    }
}
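The traverseNineGrid helper used above is not listed in this document. Judging by its name and by how flag is used, it presumably searches the 3x3 neighbourhood of a pixel for a valid 3D point and reports whether one was found. The following is only a sketch under that assumption: the Frame and Point3f types, the function name findValidInNineGrid, and the convention that a NaN z value marks a missing measurement are all hypothetical and do not come from the SDK.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Point3f { float x, y, z; };

// Hypothetical stand-in for a 3D frame: row-major points, where a NaN z
// value is assumed to mean "no measurement at this pixel".
struct Frame {
    int width;
    int height;
    std::vector<Point3f> points;
};

// Check (x, y) first, then its eight neighbours. On success, copy the first
// valid point into *out and return true; return false if none is valid.
bool findValidInNineGrid(Point3f* out, int x, int y, const Frame& frame)
{
    static const int offsets[9][2] = {
        {0, 0}, {-1, 0}, {1, 0}, {0, -1}, {0, 1},
        {-1, -1}, {1, -1}, {-1, 1}, {1, 1}};
    for (const auto& o : offsets) {
        const int nx = x + o[0];
        const int ny = y + o[1];
        // Skip neighbours that fall outside the frame
        if (nx < 0 || ny < 0 || nx >= frame.width || ny >= frame.height) continue;
        const Point3f& p = frame.points[static_cast<std::size_t>(ny) * frame.width + nx];
        if (!std::isnan(p.z)) {
            *out = p;
            return true;
        }
    }
    return false;
}
```

Falling back to a neighbour fills small holes in the point cloud, at the cost of slightly blurring edges; pixels with no valid neighbour stay black in the binarized image.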
Save and display the image
cv::imwrite("ResultImg.png", image);
cv::imshow("ResultImg", image);
cv::waitKey();
OCR
OCR is performed with the Tesseract library, using the binarized image as input. The recognition result is stored in the recognizedText variable and printed.
// Initialize the recognition API
tesseract::TessBaseAPI* api = new tesseract::TessBaseAPI();
if (api->Init(NULL, "eng")) {
    fprintf(stderr, "Could not initialize tesseract.\n");
    exit(1);
}
Pix* img = pixRead("ResultImg.png");
api->SetImage(img);
char* recognizedText = api->GetUTF8Text();
std::string text(recognizedText);
std::cout << "Recognized text: " << text << std::endl;
Cleanup
Shut down the API and release the allocated memory.
api->End();
delete api;
delete[] bitmapBuffer;
delete[] redBuffer;
delete[] greenBuffer;
delete[] blueBuffer;
delete[] recognizedText;
pixDestroy(&img);
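The manual delete[] calls above must stay in sync with every allocation. One possible alternative, shown here only as a sketch, is to let std::vector own the bitmap planes so they are released automatically. The BitmapBuffers type is hypothetical and not part of the project.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical owner of the four 8-bit image planes. Each vector frees its
// memory when the object goes out of scope, so no delete[] is needed.
struct BitmapBuffers {
    std::vector<unsigned char> gray;
    std::vector<unsigned char> red;
    std::vector<unsigned char> green;
    std::vector<unsigned char> blue;

    // pixelCount is expected to be xPixResolution * yPixResolution.
    explicit BitmapBuffers(std::size_t pixelCount)
        : gray(pixelCount), red(pixelCount), green(pixelCount), blue(pixelCount) {}
};
```

Functions that expect a raw unsigned char* can still be given buffers.gray.data() and so on. The string returned by GetUTF8Text is different: Tesseract allocates it, so it still has to be released with delete[] as in the listing above.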
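Conclusion
- This application relies mainly on the MPSizectorS_API and MPSizectorS_Utils libraries for data processing and conversion, on OpenCV for image processing and visualization, and on Tesseract for character recognition.
- MPSizectorS_API: provides the application programming interface for MPSizectorS ranging instruments.
- MPSizectorS_Utils: supports SizectorS_SDK_DigitRec by handling data conversion, preprocessing, and related tasks.
- OpenCV: provides image-processing functionality, including the Mat object and the imread, imwrite, and imshow functions.
- Tesseract: provides OCR functionality.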