
- SJ_Zhang
- May. 5, 2025
Abstract
Image sensors with internal computing capabilities fuse sensing and computing to significantly reduce the power consumption and latency of machine vision tasks. Linear photodetectors such as 2D semiconductors with tunable electrical and optical properties enable in-sensor computing for multiple functions. In-sensor computing at the single-photon level is much more plausible but has not yet been achieved. Here, we demonstrate a photon-efficient camera with in-sensor computing based on a superconducting nanowire array detector with four programmable dimensions including photon count rate, response time, pulse amplitude, and spectral responsivity. At the same time, the sensor features saturated (100%) quantum efficiency in the range of 405–1550 nm. Benefiting from the multidimensional modulation and ultra-high sensitivity, a classification accuracy of 92.22% for three letters is achieved with only 0.12 photons per pixel per pattern. Furthermore, image preprocessing and spectral classification are demonstrated. Photon-efficient in-sensor computing is beneficial for vision tasks in extremely low-light environments such as covert imaging, biological imaging and space exploration. The single-photon image sensor can be scaled up to construct more complex neural networks, enabling more complex real-time vision tasks with high sensitivity.
Introduction
Traditional image sensors and computing units are separated. Therefore, the large amount of data obtained by the image sensor must first be converted into digital signals through analog-to-digital conversion and temporarily stored in memory, and then transferred to a local computing unit or cloud computing system. This process leads to high power consumption and latency. Emerging in-sensor computing1,2 can alleviate these problems. With the in-sensor computing architecture, a single reconfigurable sensor or multiple interconnected sensors can directly sense and process information, which eliminates a large amount of redundant data transmission and integrates sensing and computing functions3,4,5,6. At present, in-sensor computing based on two-dimensional material photodetectors has achieved notable results, such as image classification7, spectral resolution8, motion perception9, and image preprocessing10,11,12. However, current in-sensor computing is generally based on sensors operating in linear mode with low sensitivity, which require a long integration time to achieve the desired results. The problem is even worse in low-light environments. Increasing the sensitivity of in-sensor computing to the single-photon level improves photon utilization, thereby simplifying vision tasks in extremely low-light environments, such as covert imaging13, biological imaging14,15, and space exploration16,17.
Among existing image sensors, single-photon detectors are the most photon-efficient, especially the superconducting nanowire single-photon detector (SNSPD), which has developed rapidly over the past 20 years. SNSPDs offer high detection efficiency (>98%)18,19, low dark count rate (10⁻⁴ cps)20, low timing jitter (3 ps)21, fast response (<1 ns)22, and a wide operating band23,24 (ultraviolet to mid-infrared). Based on various multiplexing schemes, SNSPD arrays have recently expanded from 1024 pixels25,26 to 400,000 pixels27.
Here, we demonstrate a photon-efficient camera with in-sensor computing based on a multidimensional programmable superconducting nanowire array that can simultaneously sense and process images projected onto the chip to realize various vision tasks, such as image classification, image preprocessing, and spectral resolution. Two computing architectures, PCR computing and area computing, are constructed based on the S-shaped photon count rate (PCR) curve. In the regime where the PCR increases nonlinearly with the bias current, the computing is based on the total count rate, so that the signal-to-noise ratio is high and collection is convenient. In the regime where the PCR is saturated, the response time and pulse amplitude still increase nonlinearly with the bias current, and the computing is based on the total integrated area of the pulses, which can further improve photon utilization. A photon-efficient camera with in-sensor computing is expected to be applied in nondestructive biological imaging and identification. Furthermore, the sensor’s wide operating band gives it broad prospects in high-precision astronomical detection.
Results
As shown in Fig. 1a, conventional imaging methods require collecting and storing data from all pixels for postprocessing to achieve various vision tasks, which is highly redundant. In contrast, the readout signals of in-sensor computing directly correspond to the results of image classification, image preprocessing, and spectral classification, which can reduce the pressure of data readout and postprocessing. Figure 1b takes image classification as an example to show the principle of in-sensor computing based on an SNSPD array. Different color maps correspond to different convolution kernels and the weights of the convolution kernels correspond to the bias current of different pixels. The SNSPD array has multiple programmable dimensions (Fig. 1c), including PCR, response time, pulse amplitude, and spectral responsivity. According to the output signals f1, f2, f3 obtained by loading each set of kernels, the letter projected on the sensor can be identified. The bias current is updated during training based on the error between the label and the output (the training details are shown in Supplementary Notes 1–4).
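The principle described above can be sketched numerically as a weighted sum per kernel followed by a comparison of the outputs. This is a minimal illustration, not the authors' training or readout code: the function names, the toy patterns, the signed ±1 weights standing in for bias currents, and the argmax decision rule are all assumptions made for the example.

```python
import numpy as np

def in_sensor_outputs(photon_counts, kernels):
    """Return one summed output per kernel (f1, f2, f3 when there are 3 kernels).

    photon_counts: (5, 5) array of per-pixel counts for one projected pattern.
    kernels: (K, 5, 5) array of signed weights (hypothetical stand-ins for
             the per-pixel bias currents).
    """
    photon_counts = np.asarray(photon_counts, dtype=float)
    kernels = np.asarray(kernels, dtype=float)
    # Multiply each kernel by the counts pixel-wise and sum over the array.
    return np.tensordot(kernels, photon_counts, axes=([1, 2], [0, 1]))

def classify(photon_counts, kernels):
    """Assumed decision rule: pick the kernel with the largest output."""
    return int(np.argmax(in_sensor_outputs(photon_counts, kernels)))

# Toy 5x5 patterns (not letters): top row, middle column, main diagonal.
row = np.zeros((5, 5)); row[0, :] = 1
col = np.zeros((5, 5)); col[:, 2] = 1
diag = np.eye(5)
toy_patterns = np.stack([row, col, diag])
# Hypothetical weights: +1 on pixels of the target pattern, -1 elsewhere.
toy_kernels = 2.0 * toy_patterns - 1.0
predictions = [classify(p, toy_kernels) for p in toy_patterns]
```

In the experiment the weights are learned and loaded as bias currents, so only the K summed outputs need to be read out rather than all 25 pixel values.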
Fig. 1: Conventional imaging and in-sensor computing architectures.
a Conventional imaging and postprocessing process. b In-sensor computing using a superconducting detector. The different color maps correspond to different convolution kernels. The size of the kernel is 5 × 5 × N, where N is the number of multiplexed dimensions. Functions implemented by in-sensor computing include image classification, image preprocessing, and spectral classification. c Superconducting nanowire arrays with multiple programmable dimensions, including photon count rate, response time, pulse amplitude, and spectral responsivity. d Two different computing architectures: red represents PCR computing, blue represents area computing, and i is the number of convolution kernels.
A digital micromirror device (DMD) is used to modulate the collimated laser to generate different patterns (Fig. S28). The laser beam enters through the bottom window and is focused onto the sensor. A homemade three-terminal readout circuit is used to load the DC bias current to the sensor, and the response signals of all pixels are then synthesized (using a multichannel TDC/ADC for data acquisition and then summing the data) and read out. Figure 1d shows the two computing architectures, based on PCR and on pulse integrated area. CR1–CRi correspond to the readout signals of the programmable PCR, A1–Ai correspond to the pulse integrated area based on the programmable response time and pulse amplitude, and i is the number of convolution kernels.
The superconducting nanowire array consists of 25 pixels, and the performance of each pixel is consistent. A PCR that remains unchanged as the bias current increases28,29 indicates that the quantum efficiency has reached saturation (100%). As shown in Fig. 2a, b, all pixels of the sensor maintain saturated (100%) quantum efficiency at wavelengths of 405 nm (4–8 µA) and 1550 nm (7.5–8 µA). The shapes of the count rate curves of the 25 pixels are consistent, and the differences in saturation count rates are caused by the nonuniformity of the light spot during the characterization process (in subsequent experiments for in-sensor computing, we use a light spot with nearly uniform light intensity). The sensor generates positive and negative pulses (Fig. 2c) under positive and negative bias currents, respectively, and their PCRs are consistent. For incident light at 405 nm, when the bias current is below 4 µA, the PCR increases with the bias current, which enables computing based on the programmable PCR. When the bias current is 4–7 µA, the PCR does not change while the response time and pulse amplitude increase with the bias current (Fig. 2c), so the pulse integrated area can be used for computing. As the incident light intensity increases, the photon count of each pixel increases linearly (Fig. 2d), which is an inherent property of single-photon detectors.
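To make the idea of a bias-programmable weight concrete, the S-shaped PCR curve can be approximated by a logistic function and inverted to find the bias current for a target weight. This is only a sketch under stated assumptions: the logistic form and the parameters `i_half_uA` and `width_uA` are hypothetical placeholders, not fitted to the measured curves in Fig. 2.

```python
import math

def pcr_weight(bias_uA, i_half_uA=2.5, width_uA=0.5):
    """Normalized PCR (0..1) versus bias current, modeled as a logistic S-curve.

    The absolute value reflects the statement that positive and negative
    bias currents give the same count rate. Parameters are hypothetical.
    """
    return 1.0 / (1.0 + math.exp(-(abs(bias_uA) - i_half_uA) / width_uA))

def bias_for_weight(weight, i_half_uA=2.5, width_uA=0.5):
    """Invert the logistic model: bias magnitude that yields a given weight."""
    if not 0.0 < weight < 1.0:
        raise ValueError("weight must lie strictly between 0 and 1")
    return i_half_uA + width_uA * math.log(weight / (1.0 - weight))

target = 0.7
bias = bias_for_weight(target)   # bias magnitude in microamperes
recovered = pcr_weight(bias)     # round-trips back to the target weight
```

With real devices the mapping would come from the measured per-pixel curves (for example by interpolation) rather than from an analytic model.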
Fig. 2: Photon detection performance of the sensor.
a, b Photon count rate and dark count rate of the 25 pixels of the sensor at a 405 nm and b 1550 nm; positive and negative bias currents give the same count. c The sensor generates positive and negative pulses under positive and negative bias currents, respectively, and the pulse width and pulse amplitude increase as the bias current increases. d With the increase of incident light intensity, the photon count of each pixel increases linearly. The photon counts of all pixels are displayed stacked.
Image classification
PCR computing is used to achieve 26-letter classification, and the distribution of bias current after training is shown in Fig. S5. Owing to the crosstalk between different micromirrors of the DMD, it can be seen from Fig. 3a, b that the quality of the direct acquisition image of some letters is poor. The low signal-to-noise ratio makes it difficult to distinguish some letters, but the overall classification accuracy (Fig. S6a) is still greater than 90%. When the average photon number per pixel per pattern (PPP) is 10.9, the classification accuracy of most letters is above 95% (Fig. 3c).
Fig. 3: Experimental results of image classification.
a Original projected image of 26 letters. b Directly acquired images of 26 letters. c Experimental confusion table for 26-letter classification (10.9 photons per pixel per pattern). d, e Comparison of the classification accuracy of “NJU” using PCR computing and area computing. d Results of the normal projected image. e Results of eliminating crosstalk caused by DMD projection. The peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) between the direct acquisition results and the original patterns are listed. f Directly acquired images of “NJU” with 0.12 photons per pixel per pattern (the crosstalk signals have been removed).
Further analysis is conducted on the classification of “NJU” (see Fig. S2 for more directly acquired images). The background light and the crosstalk of the DMD cause the image to change dynamically and lead to poor image quality. The peak signal-to-noise ratio between the acquired image and the projected image is generally around 10 dB. PCR computing only requires a single-channel counter for acquisition. Fig. S6b compares the classification accuracy obtained with positive bias currents only and with both positive and negative bias currents. With positive bias currents only, the accuracy is higher when PPP is less than 1. However, as the number of photons increases, its accuracy falls below that of positive and negative bias currents and cannot reach 100%. Therefore, positive and negative bias currents are used in subsequent computing.
The simulation data set and the experimentally acquired data set are each used for training to optimize the bias current matrix (Fig. S6d). The simulation data set simulates noise caused by background light and light intensity fluctuations. The accuracies based on the two data sets are basically consistent, which shows that the classification network generalizes well. When PPP is less than 2, the accuracy corresponding to the experimentally acquired data set is slightly higher.
Although acquisition for PCR computing is convenient, it requires multiple pulses to improve the classification accuracy. Area computing utilizes the two programmable dimensions of recovery time and pulse amplitude, and the weight is reflected in a single pulse (Fig. S6c). In addition, all bias currents correspond to saturated quantum efficiency, so area computing requires less PPP. Figure 3d, e compares the accuracy of PCR computing and area computing and shows that area computing requires less PPP for the same accuracy.
For normal acquisition, when PPP is 0.56, the classification accuracy of area computing is 90.2% (Fig. 3d), and the classification accuracy of PCR computing is 83.42%. When multichannel acquisition is used to eliminate the crosstalk signals caused by the optical path (Fig. 3e), the classification accuracy of both architectures is improved. When PPP is only 0.12, the classification accuracy of area computing can reach 92.22%. The classification accuracy of area computing shows a step-like curve as the average photon number changes, whereas the classification accuracy corresponding to PCR computing changes smoothly with the average photon number. In most cases, the classification accuracy of area computing is better than that of PCR computing for the same number of detected photons (Fig. 3d, e). When the average number of photons is very small (Fig. S20), the area computing is greatly affected by noise photons, resulting in a sharp decrease in accuracy, and its accuracy will be lower than that of PCR computing. Figure 3f shows the image collected directly when the PPP is 0.12 (after eliminating crosstalk, the results without eliminating crosstalk are shown in Fig. S23). Here, 0.12 PPP corresponds to 9 photons of the 25-pixel array. No letter features can be found in Fig. S23, while some features can be seen from Fig. 3f, but the letters in both figures cannot be accurately distinguished by the human eye. The optimized weights make it possible to classify images on the basis of local features, but the human eye needs more information to confirm the image category, which indicates that the photon utilization of in-sensor computing is greater than that of normal acquisition.
Image preprocessing
In addition to image classification, we demonstrate reconfigurable in-sensor image preprocessing. The left side of Fig. 4a is the original projection image, and the right side is the image acquired directly by the sensor. The image intensity is given by the PCR. Owing to the difference in light intensity at each pixel, the directly acquired image shows obvious blocky artifacts. The reconfigurable convolution kernel here refers to modulating the PCR of each pixel by varying the bias current. Three convolution kernels are given in Fig. 4b–d. The blue/red squares represent positive/negative bias currents, and white squares represent no bias current. The colors from light to saturated represent increasing absolute values of bias currents. We demonstrate three image preprocessing operations: Gaussian filtering, edge enhancement, and image sharpening. The results of simulation and experiment are consistent. The edge enhancement and sharpening operations partially alleviate the blocky artifacts of the image. In-sensor image preprocessing helps save computing resources and makes it possible to achieve real-time stylized imaging in extremely low-light environments. Subsequent optimization of the network and sensor may enable real-time image denoising and other complex functions.
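The three operations named above can be illustrated with textbook 5 × 5 kernels applied as a sliding weighted sum. This is a hedged sketch: `GAUSSIAN`, `EDGE`, and `SHARPEN` are standard illustrative kernels chosen for this example, not the bias current matrices of Fig. 4b–d, and `apply_kernel` is a hypothetical helper.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Binomial approximation of a 5x5 Gaussian; entries sum to 1.
_b = np.array([1.0, 4.0, 6.0, 4.0, 1.0])
GAUSSIAN = np.outer(_b, _b) / 256.0

# Identity kernel: passes the centre pixel through unchanged.
DELTA = np.zeros((5, 5))
DELTA[2, 2] = 1.0

# Edge enhancement as "image minus blurred image"; entries sum to 0,
# so it needs both positive and negative weights.
EDGE = DELTA - GAUSSIAN

# Sharpening as "image plus edges" (unsharp masking); entries sum to 1.
SHARPEN = DELTA + EDGE

def apply_kernel(image, kernel):
    """Slide the kernel over the image and sum weight * pixel in each window.

    Only positions where the kernel fits entirely are returned, so an
    H x W image gives an (H-4) x (W-4) result for a 5x5 kernel. The kernels
    here are symmetric, so correlation and convolution coincide.
    """
    windows = sliding_window_view(np.asarray(image, dtype=float), kernel.shape)
    return np.einsum("ijkl,kl->ij", windows, kernel)

flat = np.full((8, 8), 3.0)          # a featureless test image
blurred = apply_kernel(flat, GAUSSIAN)
edges = apply_kernel(flat, EDGE)
sharpened = apply_kernel(flat, SHARPEN)
```

A flat image is a useful sanity check: smoothing and sharpening should leave it unchanged, while edge enhancement should return zero everywhere.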
Fig. 4: Reconfigurable in-sensor image preprocessing and spectral classification.
a Projected original image and corresponding image acquired directly by the sensor. The image intensity is given by the PCR. b–d Image preprocessing with three different operations: b Gaussian filtering, c edge enhancement, d image sharpening. The operating kernels modulate the PCR of each pixel by independently changing the bias current. Blue/red squares represent positive/negative bias currents, and white squares represent no bias current. Colors from light to saturated represent increasing absolute values of bias currents. Experimental results of the distinct operations are compared with simulations. e, f PCR as a function of bias current for e wavelengths from 405 to 1550 nm and f wavelengths from 1490 to 1590 nm. g Convolution kernel for wavelength classification and corresponding PCR. S1–S5 represent the total counts of the five pixels in each row. After normalizing S1–S5, values greater than 0.4 are set to 1 and values less than 0.4 are set to 0, so that S1–S5 = 11111 corresponds to 405 nm, S1–S5 = 01111 corresponds to 650 nm, S1–S5 = 00111 corresponds to 1064 nm, S1–S5 = 00011 corresponds to 1310 nm, and S1–S5 = 00001 corresponds to 1550 nm.
Spectral classification
The superconducting nanowire array has intrinsic spectral resolution. As the bias current changes, the PCR curves of different wavelengths are different. Figure 4e shows the PCR curves of five wavelengths from 405 nm to 1550 nm. In near-infrared, the sensor’s spectral resolution can reach 20 nm (Fig. 4f), and its resolution can be better after optimizing the optical path. Based on the array device, we only need to obtain PCRs at a set of bias currents to determine the wavelength of the incident photon. Figure 4g shows the specifically optimized bias current matrix. The output signals S1-S5 can be obtained by summing the PCRs of five pixels in each row. The discrimination matrix (Fig. 4g) can be obtained by normalizing S1-S5 at different wavelengths. Set the value greater than 0.4 to 1 and the value less than 0.4 to 0, so that S1–S5 = 11111 corresponds to 405 nm, S1–S5 = 01111 corresponds to 650 nm, S1–S5 = 00111 corresponds to 1064 nm, S1–S5 = 00011 corresponds to 1310 nm, S1–S5 = 00001 corresponds to 1550 nm. Spectral classification can be combined with image classification to improve classification accuracy. Moreover, the introduction of spectral resolution can provide more image preprocessing operations, such as radiometric correction and hyperspectral remote sensing image processing.
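The thresholding rule above amounts to a small lookup, sketched here. The code table and the 0.4 threshold come from the text; normalizing by the largest row sum and treating a value of exactly 0.4 as 0 are assumptions, since the text does not specify either, and the example counts are invented.

```python
# Binary codes for S1..S5 and the wavelengths (nm) they map to, as listed in the text.
CODE_TO_WAVELENGTH_NM = {
    (1, 1, 1, 1, 1): 405,
    (0, 1, 1, 1, 1): 650,
    (0, 0, 1, 1, 1): 1064,
    (0, 0, 0, 1, 1): 1310,
    (0, 0, 0, 0, 1): 1550,
}

def classify_wavelength(row_counts, threshold=0.4):
    """Map the five row sums S1..S5 to a wavelength, or None if unrecognized.

    Assumption: the sums are normalized by their maximum before thresholding.
    """
    peak = max(row_counts)
    if peak <= 0:
        return None  # no counts at all, nothing to classify
    # Values above the threshold become 1; everything else (including an
    # exact tie, which the text leaves undefined) becomes 0.
    code = tuple(1 if count / peak > threshold else 0 for count in row_counts)
    return CODE_TO_WAVELENGTH_NM.get(code)

# Invented example row sums, for illustration only.
example_650 = classify_wavelength([50, 900, 950, 1000, 980])
example_1550 = classify_wavelength([10, 20, 30, 100, 900])
example_empty = classify_wavelength([0, 0, 0, 0, 0])
```

Codes outside the table (for example 10101) return `None`, which could be used to flag a measurement as noisy or out of range.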
Discussion
To summarize, we demonstrate a photon-efficient in-sensor computing camera using a 25-pixel SNSPD. Each pixel of the sensor has saturated quantum efficiency at wavelengths from 405 nm to 1550 nm. Based on the S-shaped PCR curve, we construct two computing architectures. PCR computing requires only a single-channel counter for acquisition. Area computing utilizes two programmable dimensions of recovery time and pulse amplitude, so the number of photons required for computing is extremely low. When the PPP is only 0.12 in area computing, the classification accuracy of the three letters “NJU” can reach 92.22%. In addition to image classification, we also demonstrate image preprocessing and spectral classification. The above operations are all completed inside the sensor, and the serial output signals are the result we want without postprocessing. The sensor currently has four programmable dimensions including PCR, response time, pulse amplitude, and spectral resolution. The intrinsic polarization response characteristics of SNSPD can be added and all these programmable dimensions can be jointly optimized to further enrich the functions of in-sensor computing.
In normal acquisition, each pixel collects all information with the same photoresponsivity, and the acquired data are weighted during postprocessing. However, in-sensor computing loads the weights to each pixel in advance, so the utilization rate of photons under ideal conditions is higher than that of normal acquisition. In addition, only a portion of the pixels are working at the same time, resulting in lower power consumption. We characterize the classification accuracy of normal acquisition based on the same sensor, and its classification accuracy is better than that of in-sensor computing (Fig. S21). The reason for this anomalous result is that the information obtained by normal acquisition is more complete, and the weights that can be loaded by postprocessing are more accurate than those by in-sensor computing. In addition, in-sensor computing is more susceptible to various types of noise and crosstalk. However, the entire process of normal acquisition is cumbersome and requires data storage and offline postprocessing. By optimizing programmable dimensions to provide more precise weights in the future, better results than normal acquisition can be achieved.
For specific tasks, a precoded sensor can be constructed based on the reconfigurable sensor. Loading optimized weights into the sensor can improve computing efficiency. We have previously developed a pulse-encoded SNSPD array30, in which micron inductor lines of different lengths are connected in series to each pixel to change the kinetic inductance and AC impedance. Therefore, the pulse areas of different pixels are different under the same bias current. (The circuit diagram is shown in Fig. S19 and the specific architecture is explained in Supplementary Note 9.) In addition, for large-scale array sensors, we can also use a row-column multiplexing31 scheme to modulate the bias current, thereby achieving reconfigurable coding with fewer electrical channels.
In-sensor computing can alleviate the pressure of data readout and processing of array sensors, but currently, the functions based on the sensor itself are limited. In the future, we will combine array sensors with on-chip diffraction neural networks32,33 to enrich computing capabilities. Superconducting optoelectronic synapses34 constructed by combining SNSPD with Josephson junctions have been proposed recently. Further combination of our computing architecture with superconducting optoelectronic synapses35 will be able to achieve more complex visual functions.
The extremely low operating temperature of superconducting sensors cannot be ignored, but with the development of refrigeration technology, the volume and power consumption of refrigeration equipment have decreased36. With the exploration and development of new superconductors, the operating temperature of sensors is constantly increasing37. The development of CMOS-compatible processes38 and low-temperature CMOS circuits39,40 will make on-chip signal processing of superconducting sensors more convenient.
Methods
Modeling of the sensor
The response model of the sensor is \(R(x,y,I)=QE(x,y,I)\cdot Ab(x,y)\cdot C(x,y)\cdot \left[S(x,y)+B(x,y)\right]+DCR(x,y,I)\), where R is the response count of each pixel under different bias currents, QE is the intrinsic detection efficiency (quantum efficiency) of the pixel, Ab is the light absorption efficiency of the pixel, C is the coupling efficiency of the entire optical system, S is the number of incident photons, B is the number of background photons, and DCR is the dark count of the pixel. x and y represent the position of the pixel in the array, and I is the operating (bias) current. Optical path loss and coupling efficiency are not considered here, so Ab and C are set to 1.
$$\text{Model of PCR computing:}\quad PCR(x,y,I)=R(x,y,I)$$
$$\text{Model of area computing:}\quad Area(x,y,I)=R(x,y,I)\cdot A(x,y,I),$$
where A is the pulse area of each pixel under different bias currents.
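The two models can be written directly as array operations. This sketch only transcribes the equations above; the function names and the example numbers are illustrative assumptions, and Ab and C default to 1 as stated in the text.

```python
import numpy as np

def response_counts(qe, signal, background, dark, absorption=1.0, coupling=1.0):
    """R = QE * Ab * C * (S + B) + DCR, evaluated element-wise per pixel."""
    qe = np.asarray(qe, dtype=float)
    photons = np.asarray(signal, dtype=float) + np.asarray(background, dtype=float)
    return qe * absorption * coupling * photons + np.asarray(dark, dtype=float)

def pcr_output(counts):
    """PCR computing: the output is the response count itself."""
    return counts

def area_output(counts, pulse_area):
    """Area computing: each count is weighted by the pulse area A at that bias."""
    return counts * np.asarray(pulse_area, dtype=float)

# Invented two-pixel example: QE set by the bias current, 10 signal photons,
# 2 background photons, and 1 dark count per pixel.
qe = np.array([0.5, 1.0])
counts = response_counts(qe, signal=10.0, background=2.0, dark=1.0)
areas = area_output(counts, pulse_area=np.array([2.0, 0.5]))
```

In PCR computing the bias current enters through QE, whereas in area computing QE is saturated and the bias current enters through the pulse area A.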
Weights of in-sensor computing: Three sets of bias current matrices (corresponding to the weights of the three channels of the convolution kernel) are obtained through pretraining of a single-layer convolutional network. The optimized weight (absolute value) distribution is 0–1. In the experiments, the loadable weight range of the PCR computing is 0.2–1, and the loadable weight range of the area computing is 0.6–1. Therefore, we set all weights in the range of 0.6–1.
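One simple way to bring pretrained weights into the loadable range is a sign-preserving linear rescale, sketched below. The text does not say how the weights were mapped into 0.6–1, so this mapping (and the choice to leave exact zeros at zero, meaning an unbiased pixel) is an assumption, and `rescale_weights` is a hypothetical helper.

```python
import numpy as np

def rescale_weights(weights, lo=0.6, hi=1.0):
    """Map |w| in [0, 1] linearly onto [lo, hi], keeping the sign of w.

    np.sign(0) is 0, so a weight of exactly zero stays zero (no bias current).
    """
    w = np.asarray(weights, dtype=float)
    return np.sign(w) * (lo + (hi - lo) * np.abs(w))

# Example with invented weights: full positive, half negative, and zero.
loaded = rescale_weights([1.0, -0.5, 0.0])
```

Compressing the range in this way reduces the contrast between weights, which is one reason the loaded weights can differ from the optimized ones.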
The target image can be directly acquired via multichannel acquisition, so the process of in-sensor computing can be simulated based on the directly acquired image. As shown in Fig. S21, the classification accuracies of in-sensor computing and postprocessing after normal acquisition are compared. The classification accuracy of normal acquisition is greater, which may be because the information obtained by normal acquisition is more complete and the in-sensor computing is affected by the crosstalk signal (area computing) and low quantum efficiency (PCR computing). Owing to the low accuracy of the bias current, there are differences between the weights of the PCR/area computing and the optimized weights. The residual statistics between the weight matrices of the two computing architectures and the optimized weight matrix are shown in Fig. S22c. The classification accuracies corresponding to different weight matrices are compared (Fig. S22a, b), and it is found that weight differences had little effect on classification accuracy.
Based on Monte Carlo simulation, a more detailed numerical simulation of in-sensor computing is shown in Supplementary Note 7.
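As a rough indication of what such a simulation involves, the sketch below draws Poisson-distributed photon counts at a chosen PPP and estimates classification accuracy for a toy problem. It is not the simulation of Supplementary Note 7: the three toy patterns, the ±1 kernels, the argmax rule, and the uniform `background` rate are all assumptions made for illustration.

```python
import numpy as np

def make_patterns():
    """Three toy 5x5 binary patterns: top row, middle column, main diagonal."""
    row = np.zeros((5, 5)); row[0, :] = 1
    col = np.zeros((5, 5)); col[:, 2] = 1
    diag = np.eye(5)
    return np.stack([row, col, diag])

def simulate_accuracy(ppp, trials=2000, background=0.0, seed=0):
    """Estimate accuracy when the mean signal is `ppp` photons per pixel per pattern."""
    rng = np.random.default_rng(seed)
    patterns = make_patterns()
    kernels = 2.0 * patterns - 1.0           # hypothetical signed weights
    correct = 0
    for _ in range(trials):
        label = rng.integers(len(patterns))
        p = patterns[label]
        # Scale the lit pixels so the average over all 25 pixels equals ppp,
        # then add a uniform background rate to every pixel.
        rate = ppp * p.size * p / p.sum() + background
        counts = rng.poisson(rate)            # shot noise on every pixel
        outputs = np.tensordot(kernels, counts, axes=([1, 2], [0, 1]))
        correct += int(np.argmax(outputs) == label)
    return correct / trials

acc_bright = simulate_accuracy(50.0)
acc_dim = simulate_accuracy(0.01)
```

Sweeping `ppp` and `background` in such a loop gives an accuracy-versus-photon-number curve that can be compared qualitatively with the measured trends.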
Device fabrication and characterization
We fabricate a 36-pixel SNSPD array from a 6-nm-thick niobium nitride (NbN) film. The peripheral electrodes [Ti (10 nm)/Au (100 nm)] are prepared using magnetron sputtering and lift-off. Meandering nanowires are patterned by electron beam exposure and transferred from resist to NbN by reactive ion etching. The line width and spacing of the nanowires are 80 nm and 120 nm, respectively. The SEM image of the nanowires is shown in Fig. S26. As shown in Fig. S27, the superconducting critical currents of the 36 pixels are uniform. Finally, we select the 25 pixels in the upper left corner for the experiment.
Electrical measurement
A homemade multichannel bias and amplification circuit is used to independently operate 25 pixels. In order to facilitate analysis in PCR computing, a multichannel time-to-digital converter is used to record the response signals of 25 pixels. For area computing, the 25-pixel signals are synthesized and collected by an oscilloscope.
Pattern projection
The 405 nm laser passes through the fiber collimator and illuminates the DMD (Fig. S28). The patterns are written into the DMD controller in advance and switched through the control software. The light spot modulated by the DMD passes through the focusing lens and a 405 nm filter and then illuminates the sensor through the bottom window of the GM refrigerator. The sensor is mounted on the cold stage at a temperature of 2.3 K. An analysis of the DMD projection patterns is given in Supplementary Note 8.
Data availability
The data supporting the findings of this study are available within the article and its Supplementary Information. Source data are provided with this paper.