Session Ⅱ

Low-Cost Computing and Perception Methods (Submission Deadline: Nov. 15, 2025)

低功耗计算与感知方法

Chair:

Kun Hu
Beihang University, China

Co-chairs:

Juan Zhang
Beihang University, China

Baochang Zhang
Beihang University, China

Junbiao Pang
Beijing University of Technology, China

Xiaofei Chang
Northwestern Polytechnical University, China

Xupei Zhang
Xidian University, China

Keywords:

Foundation Models

(智能基础模型)
Efficient Computation

(高效计算)
Model Compression and Pruning

(模型压缩与剪枝)
Heterogeneous System Deployment

(异构平台部署)
Cross-domain Application

(跨领域应用)

Topics:

Efficient Computation and Acceleration Techniques for Vision Foundation Models

(基础模型的高效计算与加速方法)
Model Compression, Pruning, and Quantization for Vision Tasks

(模型压缩、剪枝及量化在视觉任务中的应用)
Deployment and Optimization of Visual Models on Heterogeneous Platforms (Edge/Mobile/Cloud)

(视觉模型在异构硬件平台（边缘/移动/云）上的部署与优化)
Lightweight Model Design for Resource-Constrained Environments

(面向资源受限环境的轻量级模型设计)
End-to-End Development and Evaluation of Visual Intelligence Systems

(端到端视觉智能系统的开发与评估)
Cross-domain Transferability, Generalization, and Robustness Studies

(跨领域迁移、泛化与鲁棒性研究)
Adaptation and Integration of Visual Models Driven by Multi-source Heterogeneous Data

(多源异构数据驱动的视觉模型适应与集成)
Real-world Applications of Visual Models in Industrial Inspection and Automation

(工业检测与自动化中的视觉模型实用化)
Dynamic Inference Mechanisms and Adaptive Computation Schemes

(动态推理机制与自适应计算方案)
Applications of Vision Foundation Models in Robotic Perception and Control

(视觉基础模型在机器人感知与控制中的应用)
Spatial Intelligence Computing and Visual Navigation for Robotics

(机器人空间智能计算与视觉导航)

Summary:

Vision foundation models, such as Vision Transformer (ViT), Swin Transformer, and ConvNeXt, have achieved remarkable success in a wide range of visual tasks, including image classification, object detection, and semantic segmentation, and have become a dominant paradigm in computer vision. Despite their strong representational and generalization capabilities, their large parameter counts and high computational and storage demands present substantial challenges for real-world deployment, especially in resource-constrained scenarios such as edge computing platforms.

This session focuses on recent advances and challenges in efficient computation, heterogeneous platform deployment, and cross-domain applications of visual models. Topics of interest include innovative techniques for reducing computational cost (such as model compression, pruning, and dynamic inference); adaptation and optimization of models for deployment across diverse hardware environments; and studies on the transferability, robustness, and generalization of vision models in practical applications such as smart cities, medical image analysis, and industrial inspection. We encourage submissions that address these challenges and contribute to the development of scalable, efficient, and versatile visual intelligence systems with broad real-world impact.
人工智能基础模型，如Vision Transformer (ViT)和Swin Transformer，ConvNeXt等，在图像分类、目标检测和语义分割等多种视觉任务中取得了显著成功，成为计算机视觉领域的主流范式。尽管这些模型具有强大的表示和泛化能力，但其庞大的参数量以及高计算和存储需求在实际应用中带来了巨大挑战，尤其是在资源有限的场景中，如边缘计算平台。

本次专题将重点讨论高效AI模型在高效计算、异构平台部署和跨领域应用方面的最新进展和挑战。我们关注的主题包括降低计算成本的创新技术，如模型压缩、剪枝和动态推理，适应和优化模型以在多样化硬件环境中部署，以及研究视觉模型在智慧城市、医学图像分析和工业检测等实际应用中的可迁移性、鲁棒性和泛化能力。我们鼓励提交能够应对这些挑战并推动可扩展、高效和多功能视觉智能系统发展的研究，以实现广泛的实际影响。