CNC vision research log

Published: 2019 Sept 14 · Last updated: 2019 Oct 30

I’m playing around with computer vision in the context of CNC machining. Partly to improve my machining workflow and material efficiency, but mostly to goof around with linear algebra and computer vision.

Scope + goals:

Note: This page isn’t a formal writeup — it’s a research log that includes lots of technical incantations, hacks, and dead-ends.


TODO


2019 Oct 30

Last month I realized that 2D was insufficient and I needed full 3D reconstruction. I’m aware of two possible broad directions:

I first decided to explore the former. A newsletter reader pointed me to the 2011 Kinect Fusion research work. Googling “Kinect Fusion RealSense” led me to several implementations and related work:

Open3D

I started here because it was well documented and had out-of-the-box RealSense camera support. Following the example in the docs, I tried to scan a 3D scene consisting of a tape measure, metal doweling jig, and wooden rod:

(Images: reconstruction and an RGB input frame)

I captured 1237 frames by waving my RealSense SR300 camera around, and the steps below took about 12 minutes on my i7 Mac Mini (no CUDA GPU).

pip3 install joblib pyrealsense2

# record frames; had to edit source to request lower resolution 640x480 depth stream
cd V:\open3D\examples\Python\ReconstructionSystem\Sensors

python .\realsense_recorder.py --record_imgs
cd ..
python run_system.py config/realsense.json --make       # build fragments from the RGBD sequence
python run_system.py config/realsense.json --register   # globally register the fragments
python run_system.py config/realsense.json --refine     # refine the fragment registration
python run_system.py config/realsense.json --integrate  # integrate everything into a single mesh
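
The “edit source” bit above just changes which streams the recorder asks pyrealsense2 for. Roughly, the relevant calls look like this (a sketch, not the recorder’s exact code):

import pyrealsense2 as rs

# Ask the camera for 640x480 depth + color at 30 fps.
pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)
pipeline.start(config)

try:
    frames = pipeline.wait_for_frames()
    depth = frames.get_depth_frame()
    color = frames.get_color_frame()
    print(depth.get_width(), depth.get_height())
finally:
    pipeline.stop()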

I used MeshLab to open up the resulting 187 MB model, which unfortunately turned out terribly:

Digging in more:

2019 Sept 29

Video timestamps:

My main focus over the past week has been to take precisely located images of the CNC machine bed and combine them to form a high-resolution composite image. Work included:

Next steps

Based on the early stitching results, I think the inherent geometry of the setup will always give too much perspective distortion to yield a reasonable orthographic-looking composite image. The best way forward may be to embrace 3D geometry (using the depth sensor and/or structure from motion algorithms) rather than trying to construct an orthographic 2D composite.
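
For context, the 2D route amounts to warping each photo toward the bed plane with a homography, which is exactly where the perspective-distortion fight happens. A minimal sketch of one such warp in OpenCV — the pixel/bed corner correspondences here are made up; real ones would come from the machine coordinates:

import cv2
import numpy as np

img = cv2.imread("bed_photo.jpg")  # hypothetical input photo

# Hypothetical correspondences: pixel corners of a known bed region -> where
# those corners should land in the top-down composite (composite pixels).
src = np.float32([[210, 180], [1050, 195], [1080, 700], [190, 690]])
dst = np.float32([[0, 0], [800, 0], [800, 500], [0, 500]])

H = cv2.getPerspectiveTransform(src, dst)
topdown = cv2.warpPerspective(img, H, (800, 500))
cv2.imwrite("topdown.png", topdown)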

Misc. notes


2019 Sept 22

Updates since last week:

Next steps


2019 Sept 14

Current status:

Early explorations on computer vision and Inventor COM programming were done over a few pairing sessions with Geoffrey Litt; check out their notes.

I have a few technical constraints:

and I’ve explored several technical architectures for this project.

Python + OpenCV

Since I’ve never been able to manage Python deps on Mac/Linux (every time I Google, there’s a new “solution”), I assumed it’d be terrible on Windows. But then I found PyInstaller, which I got working in 2 minutes: pyinstaller --onefile my_program.py creates a self-contained EXE. Yay! The resulting EXE is 5 MB for a “Hello World”, but importing OpenCV balloons it to 50 MB, which I’m not thrilled about.

I’m using the win32com Python library. It works, but is slow — takes about 1s to read the coordinates of a few dozen points from an Inventor sketch. TBD whether that’s Python/COM stuff or inherent to Inventor inter-process COM.
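
For flavor, the kind of call I’m timing looks roughly like this (a sketch, not my exact code — “Sketch1” is a stand-in name, and it assumes Inventor is already running with a part document open):

import win32com.client

# Attach to the running Inventor instance over COM.
inv = win32com.client.GetActiveObject("Inventor.Application")
doc = inv.ActiveDocument  # assumes the active document is a part

# Read the 2D coordinates of every point in a named sketch.
sketch = doc.ComponentDefinition.Sketches.Item("Sketch1")
points = [(pt.Geometry.X, pt.Geometry.Y) for pt in sketch.SketchPoints]
print(len(points), "points")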

For exploring Python and new libraries, it’s been helpful to have automatic code reloading. This script automatically reloads the cncv module and runs/“benchmarks” its main function. This lets me change code and immediately see the results without having to switch windows and restart anything manually.

import importlib
import time
import os

import cncv as target_mod

LAST_MODIFY_TIME = 0

def maybeReload(module):
    global LAST_MODIFY_TIME
    lt = os.path.getmtime(module.__file__)
    if lt != LAST_MODIFY_TIME:
        LAST_MODIFY_TIME = lt
        try:
            importlib.reload(module)
            print("reloaded!")
        except Exception as e:
            # swallow exception so that reloader doesn't crash
            print(e)


while True:
    maybeReload(target_mod)

    start = time.process_time()
    try:
        target_mod.main()
    except Exception as e:
        # swallow exception so that runner doesn't crash
        print(e)
    dt = time.process_time() - start
    # print(dt)
    time.sleep(1)

Rust

Since the vision logic I needed is pretty basic linear algebra, I thought I’d try doing it in pure Rust, without OpenCV. I tried to cross compile from Mac to Windows following this post but it required copying some files from the Visual Studio install on Windows to my Mac — at which point I decided I might as well just compile on Windows.

I then tried implementing basic linear algebra in Rust. However, I found nalgebra quite frustrating: There are a lot of types to write out, which makes it much more verbose than Python/NumPy. E.g.,

let mut center = points
    .iter()
    .fold(Vector2::<f64>::new(0.0, 0.0), |c, x| c + x.coords);
center.apply(|x| x / points.len() as f64);

versus

center = np.mean(points, axis=0)

Also (because types) there’s no 8x8 Matrix constructor, so I gotta write:

let a = MatrixMN::<f64, U8, U8>::from_row_slice(&[1.0, ...])

compared to Python/NumPy’s

a = np.array([[1.0, ...], ...])

Another blocker in Rust was that after a few hours of research, I couldn’t figure out how to access a COM API. I probably have to use winapi (which has a function called CoInitializeEx, a COM thing), but the docs aren’t great and my searches for COM “Hello world” examples on GitHub fell short.

C++ and OpenCV

I don’t know C++ and am not thrilled about the idea of picking it up. (I learned Rust specifically because I didn’t want to learn about C++ footguns.)

However, both the Inventor SDK sample AddIns and many OpenCV examples are written in C++.

I first had to compile OpenCV with static libraries (I did all this on OS X, since I’m more familiar there and assume I could do it again on Windows if it seemed promising):

brew install cmake pkg-config jpeg libpng libtiff openexr eigen tbb
git clone https://github.com/opencv/opencv && git clone https://github.com/opencv/opencv_contrib
cd opencv && mkdir build && cd build

cmake -D CMAKE_BUILD_TYPE=RELEASE \
    -D CMAKE_INSTALL_PREFIX=/opt/opencv \
    -D OPENCV_EXTRA_MODULES_PATH=../../opencv_contrib/modules \
    -D BUILD_opencv_python2=OFF \
    -D BUILD_opencv_python3=OFF \
    -D INSTALL_PYTHON_EXAMPLES=OFF \
    -D INSTALL_C_EXAMPLES=OFF \
    -D OPENCV_ENABLE_NONFREE=OFF \
    -D BUILD_SHARED_LIBS=OFF \
    -D BUILD_EXAMPLES=OFF ..

make -j

I then compiled this aruco-markers demo. I had to add/update these lines in CMakeLists.txt:

set(OpenCV_DIR /opt/opencv/lib/cmake/opencv4)
find_package (Eigen3 3.3 REQUIRED NO_MODULE)
target_link_libraries(detect_markers opencv_core Eigen3::Eigen opencv_highgui opencv_imgproc opencv_videoio opencv_aruco)

then run

cd detect_markers && mkdir build && cd build && cmake ../ && make

to spit out a binary — it’s 37 MB, but upx shrinks it down to 16 MB.
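
For comparison, the same marker detection is only a few lines in Python with opencv-contrib-python. A sketch using the classic cv2.aruco API (dictionary and image name chosen arbitrarily):

import cv2

img = cv2.imread("markers.jpg")  # hypothetical test image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_6X6_250)
params = cv2.aruco.DetectorParameters_create()
corners, ids, rejected = cv2.aruco.detectMarkers(gray, dictionary, parameters=params)

print("found markers:", None if ids is None else ids.flatten().tolist())
img = cv2.aruco.drawDetectedMarkers(img, corners, ids)
cv2.imwrite("markers_annotated.jpg", img)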

Another potential reason to go C++ is to compile an Inventor AddIn, which runs “in-process” in Inventor and may be faster than using the COM API.

Open questions

Rust / COM API: It’s gotta be possible. Can anyone point to an example (either in Rust, C, or C++)?

Better camera: I don’t know much about optics, but for the problem of precise measurement of still objects under controlled lighting, I think I just want as many megapixels as possible. What’s the most cost-effective way to do that? My more detailed question on Computer Vision Reddit didn’t turn up any leads.

Next steps

I’ll continue in Python+OpenCV since it’s working, which outweighs my (largely aesthetic) concerns about performance and distribution size.

Other notes

I bought a $79 RealSense SR305 coded-light depth camera. I tried streaming USB from the control computer to the design computer using VirtualHere, but while the USB camera was detected on the design computer, Intel’s viewer software didn’t pick it up. (The viewer does pick it up if the camera is directly connected.) Maybe because VirtualHere doesn’t support USB3?

To compile Rust on Windows, you need the Visual Studio Build Tools C++ tools. There is a way to download the package in advance / do an offline install — it must be done through the Visual Studio installer.

When compiling Rust, set a PowerShell env variable so cargo compiles to the VM desktop rather than the network drive. (https://github.com/rust-lang/rust/issues/54216#issuecomment-448282142)

[Environment]::SetEnvironmentVariable("CARGO_TARGET_DIR", "C:\Users\itron\Desktop")

Also set PowerShell to be like Emacs:

Set-Executionpolicy RemoteSigned -Scope CurrentUser
notepad $profile

and add this line:

Set-PSReadLineOption -EditMode Emacs

Inventor ships with an SDK installer (in C:\Users\Public\Documents\Autodesk\Inventor 2020\SDK), but the installer won’t unpack anything unless it detects Visual Studio — and it’s not forwards compatible! I unpacked manually via

msiexec /a developertools.msi TARGETDIR=C:\inventor_sdk

so that I could access the header files and Visual Studio templates.

Trying to compile the sample Inventor AddIn failed with “RxInventor.tlb not found”, but changing InventorUtil.h to have this line:

#import "C:\Program Files\Autodesk\Inventor 2020\Bin\RxInventor.tlb" no_namespace named_guids...

resolved the issue, and the addin compiled and connected to Inventor.

Python COM will throw an exception if i.ReferencedFileDescriptor is accessed for SketchImage i while that SketchImage is being resized in Inventor.
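
A minimal defensive pattern for that (a sketch — the helper name and retry counts are mine, not anything from the Inventor API; pywintypes.com_error is the exception win32com raises):

import time
import pywintypes

def read_descriptor(sketch_image, retries=5, delay=0.2):
    # Hypothetical helper: retry ReferencedFileDescriptor while Inventor is busy
    # (e.g. the image is mid-resize) instead of letting the COM error bubble up.
    for _ in range(retries):
        try:
            return sketch_image.ReferencedFileDescriptor
        except pywintypes.com_error as e:
            print("Inventor busy?", e)
            time.sleep(delay)
    return None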